microorganisms

Article Comparative Genomics of the Genus Shows Wide Distribution of Biodegradation Traits

Daniel Garrido-Sanz , Miguel Redondo-Nieto , Marta Martín and Rafael Rivilla * Departamento de Biología, Facultad de Ciencias, Universidad Autónoma de Madrid, Darwin 2, 28049 Madrid, Spain; [email protected] (D.G.-S.); [email protected] (M.R.-N.); [email protected] (M.M.) * Correspondence: [email protected]

 Received: 15 April 2020; Accepted: 20 May 2020; Published: 21 May 2020 

Abstract: The genus Rhodococcus exhibits great potential for bioremediation applications due to its huge metabolic diversity, including biotransformation of aromatic and aliphatic compounds. Comparative genomic studies of this genus are limited to a small number of genomes, while the high number of sequenced strains to date could provide more information about the Rhodococcus diversity. Phylogenomic analysis of 327 Rhodococcus genomes and clustering of intergenomic distances identified 42 phylogenomic groups and 83 species-level clusters. Rarefaction models show that these numbers are likely to increase as new Rhodococcus strains are sequenced. The Rhodococcus genus possesses a small “hard” core genome consisting of 381 orthologous groups (OGs), while a “soft” core genome of 1253 OGs is reached with 99.16% of the genomes. Models of sequentially randomly added genomes show that a small number of genomes are enough to explain most of the shared diversity of the Rhodococcus strains, while the “open” pangenome and strain-specific genome evidence that the diversity of the genus will increase, as new genomes still add more OGs to the whole genomic set. Most rhodococci possess genes involved in the degradation of aliphatic and aromatic compounds, while short-chain alkane degradation is restricted to a certain number of groups, among which a specific particulate methane monooxygenase (pMMO) is only found in Rhodococcus sp. WAY2. The analysis of Rieske 2Fe-2S dioxygenases among rhodococci genomes revealed that most of these enzymes remain uncharacterized.

Keywords: Rhodococcus; comparative genomics; phylogenomics; biodegradation

1. Introduction Rhodococcus is a gram-positive genus within the class that is ubiquitously distributed in the environment. Strains from this genus have been isolated from a variety of habitats, including soils, oceans and fresh waters [1–3], as well as from the guts of insects or living in association with sea sponges [4,5]. Some species are known pathogens, including R. hoagii (formerly R. equi), which causes zoonotic infections in grazing animals [6,7], and R. fascians, the causing agent of leafy gall disease in plants [8,9]. In addition, multiple Rhodococcus species are known to degrade diverse organic compounds, including polychlorinated biphenyls (PCBs), polycyclic aromatic hydrocarbons (PAHs) and aliphatic hydrocarbons [10–12], making this genus a very promising tool for bioremediation purposes. The diverse number of niches that rhodococci are able to inhabit and their extensive catabolic potential are thought to be a consequence of their large genomes and the presence of multiple extrachromosomal elements that add new functional traits to the general content [13]. The of the Rhodococcus genus is constantly changing, due to the frequent description of novel species [14–16], which adds more complexity to the frequent reassignments and merge of species [17]. Examples of the inconsistency in the classification can be found in the report of an illegitimate genus name of Rhodococcus Zopf 1981 which postdates the homonym algal genus

Microorganisms 2020, 8, 774; doi:10.3390/microorganisms8050774 www.mdpi.com/journal/microorganisms Microorganisms 2020, 8, 774 2 of 16

Rhodococcus Hansgirg 1884 [18], and the proposed reclassification of Rhodococcus equi to the genus Prescottia [19]. However, the formal reclassification of R. equi into the species Rhodococcus hoagii [17] has further complicated this question, which awaits formal consideration [20]. Until these issues are resolved, Rhodococcus hoagii is still valid [21] and also the genus Rhodococcus Zopf 1981, which currently includes 66 validly named species, according to the List of Prokaryotic names with Standing in Nomenclature [22] (accessed in July 2019). Phylogenies of the Rhodococcus genus based on multilocus sequence analysis (MLSA) using the housekeeping genes 16S rRNA, secY, rpoC, and rpsA [23] or several universal protein sequences [20] have been used to address the phyletic relationship within strains from this genus and to identify a varying number of groups of species [20,24], providing more reliability than phylogenies based on the 16S rRNA gene [25,26]. However, the number of sequenced rhodococci allows now the use of whole-genome comparisons for a better understanding of their relatedness and divergence. In this sense, Average Nucleotide Identity (ANI) [27] has been used to identify seven clades within 59 Rhodococcus isolates [25], although in other proteobacterial genera, including Pseudomonas and Bradyrhizobium, the genome-to-genome blast distance phylogeny (GBDP) algorithm [28] has proven to be more reliable than ANI for establishing species and phylogenomic groups boundaries [29,30]. Comparative genomics have also been performed to assess the functional diversity of several rhodococcal groups [23,24]. However, these analyses are scarce and limited to a few genome comparisons, which do not represent the entire diversity of the genus. Therefore, a global comparison of Rhodococcus genomes is needed to better understand the differences in their lifestyles and catabolic potential and to further acknowledge their diversity. Among the different members of the Rhodococcus genus, we previously isolated the novel PCB degrader Rhodococcus sp. WAY2 from a biphenyl-degrading bacterial consortium [31]. Further analysis of its complete genome sequence revealed several genetic clusters and genes putatively involved in the biodegradation of various aromatic compounds and different chain-length alkanes [32]. Although most of these clusters have also been reported in other rhodococci [10,11,33], the distribution of these biodegradative traits among the Rhodococcus genus remains unexplored. In this work, we report a global comparative genomic study of the Rhodococcus genus, using more than 300 sequenced strains. By means of phylogenomics, digital DNA–DNA hybridization (dDDH) and the determination of clusters of orthologous groups (OGs), we explore its diversity. Finally, we analyze the distribution of certain genes and gene clusters relevant for the biodegradation of aromatic and aliphatic compounds among Rhodococcus genomes to characterize their distribution among the genus.

2. Materials and Methods

2.1. Datasets All sequenced Rhodococcus genomes, proteomes, and annotations were downloaded from the RefSeq (GeneBank when RefSeq not available) NCBI ftp server [34] in June 2019. Duplicated type strain genomes from different culture collections were removed based on the number of contigs, removing those with a higher number, likely underrepresenting the strain genome, resulting in a total of 327 genomes listed in Supplementary Table S1.

2.2. Phylogenomic Analysis The 327 Rhodococcus genomes were compared using the Genome-to-genome Blast Distance Phylogeny (GBDP) algorithm [28] via the Genome-to-genome Distance Calculator (GGDC) web service [35]. The resulting sets of intergenomic distances (Supplementary Table S2) were converted into a matrix and imported into MEGA X software [36] to build a Neighbor–Joining (NJ) phylogenomic tree. Nocardia brasiliensis ATCC 700358 was used as outgroup. In addition, GBDP was also used to calculate the digital DNA–DNA hybridization (dDDH) values among all genome pairwise comparisons. Microorganisms 2020, 8, 774 3 of 16

2.3. Clustering of Rhodococcus Genomes Clustering of GBDP intergenomic distances from the Rhodococcus genus at species level (70% dDDH) and into phylogenomic groups was examined using the OPTSIL clustering software (version 1.5, Available online: http://www.goeker.org/mg/clustering/)[37]. An average-linkage clustering (i.e., F = 0.5) was chosen, as previously proposed [29,38] and clustering threshold (T) values from 0 to 0.2, using a step size of 0.0005 were evaluated. The best T for both species and phylogenomic groups were selected based on reference partitions that yielded the highest Modified Rand Index (MRI) score, used to measure the stability of similarity of partitions. Interpolation and extrapolation analyses of the species and phylogenomic groups clusters were inferred using the iNEXT R package [39], with a bootstrap of 1000 replicates and a confidence interval of 95%.

2.4. Orthologous Groups Identification and Genome Fractions Given the large number of genomes used in the study, for the identification of orthologous groups, genomes with more than 75 scaffolds (90 genomes) were removed to avoid misrepresentation of genomic fractions. Proteomes of the 237 resulting Rhodococcus genomes were compared using the OrthoFinder software (version 2.3.3, Available online: https://github.com/davidemms/ OrthoFinder)[40], using diamond [41] searchers and the MCL graph clustering algorithm [42]. Resulting orthologous clusters were queried with an in-house designed R script to obtain the core, pangenome, and group-specific genome fractions over 300 randomly sampled genomes (i.e., 300 indices of the 237 genomes randomly selected, were constructed and queried independently to obtain the number of orthologous groups of each genome fraction). The mean, Q1, and Q3 statistics of the 300 curves for each genome fraction were calculated and then represented using the ggplot2 R package [43]. The orthologous groups identified and the R script used to calculate the genome fractions have been included in the Supplementary File S1 and Supplementary File S2, respectively. Hierarchical clustering of selected orthologous groups was performed using the pheatmap R package [44].

2.5. Phylogeny of Single-Copy Genes Orthologous sequences of 212 single copy genes present in all the genomes were used to construct a phylogenetic tree. Amino acid sequences of the 212 single copy genes were aligned using the Clustal Omega software [45] and then concatenated. The resulting alignment of concatenated sequences was examined to remove poorly aligned columns and highly divergent regions with the gblocks v0.91 software [46], using a minimum block length of two amino acids and allowing gap positions in all sequences. The resulting matrix was imported into the Pthreads-parallelized RAxML v8.2.12 [47] to build a maximum-likelihood (ML) phylogenetic tree, using the LG model of amino acid evolution [48] combined with gamma-distributed substitution rates and empirical frequencies of amino acids. Fast bootstrapping was applied, followed by the search for the best-scoring tree [49] and the autoMRE criterium [50] were applied. Tree inference was calculated using the CIPRES Science Getaway [51]. Results were imported into MEGA X software to draw the tree.

2.6. Diversity of Rieske 2Fe-2S Dioxygenases The orthologous group containing Rieske 2Fe-2S dioxygenase homologous sequences previously identified was used to construct a maximum-likelihood phylogenetic tree, using the same methods and parameters specified above. Identical sequences were removed, and highly divergent regions were conserved to avoid the removal of divergent sequences given the diversity of the sequences analyzed. Microorganisms 2020, 8, 774 4 of 16

3. Results and Discussion

3.1. Phylogenomic Analysis and Clustering of the Rhodococcus Genus The phylogenomic GBDP-based analysis of 327 Rhodococcus genomes and further clustering of the intergenomic distances (Supplementary Table S1) revealed the presence of 42 phylogenomic groups (PGs) and 83 species-level clusters (Figures1 and2). The 42 PGs are in total agreement with the reference partition according to the Modified Rand Index (i.e., MRI = 1) using a distance threshold T between 0.1395 and 0.143, which correspond to a 29.8% and 30.5% dDDH, respectively (Figure2). This result is similar to the threshold identified for phylogroups clustering in the genera Pseudomonas and Bradyrhizobium (Garrido-Sanz et al., 2016; Garrido-Sanz et al., 2019). These 42 PGs contain 22 single-genome clusters, some of which are composed of a type strain alone, and 20 others with more than one genome. Only 18 PGs contain type sequenced strain genomes and, according to the oldest species description, these are named R. fascians, (PG 2), R. kyotonensis (PG 7), R. yunnanensis (PG 8), R. corynebacterioides (PG 13), R. globerulus (PG 16), R. erythropolis (PG 18), R. marinonascens (PG 19), R. opacus (PG 22), R. rhodochrous (PG 23), R. coprophilus (PG 25), R. ruber (PG 26), R. triatomae (PG 28), R. maashanensis (PG 29), R. tukisamuensis (PG 30), R. defluvii (PG 36), R. agglutinans (PG 37), R. hoagii (PG 39), R. kunmingensis (PG 40), and R. rhodnii (PG 41, Figure1). The genome of Rhodococcus sp. WAY2 [32] is clustered with Rhodococcus sp. S2-17 and corresponds to the PG 21. Some of the PGs identified in this work are in agreement with a previous study conducted by Creason et al., 2014, which identified seven main clades within the Rhodococcus genus using 59 genomes based on whole-genome comparisons [25]. Clade I corresponds to PG 1 (sub-clades ii, iii and iv) and PG 2-R. fascians (sub-clade i), and clade II corresponds to PG 12. These two clades were phylogenetically close, as is the case of the PG 1 to PG 12 in our analyses, which share an ancestral node. Clades III, IV, V, VI, and VII identified by Creason et al., 2014 correspond to PG 18-R. erythropolis (clade III), PG 22-R. opacus (clade IV), PGs 39, 40 and 41 (R. hoagii, R. kunmingensis and R. rhodnii, all included in clade V), PG 26-R. ruber (clade VI), and PG 23-R. rhodochrous (clade VII), respectively. The remaining PGs identified in our analysis are probably missing from the previous study due to their smaller dataset. However, the fact that both analyses found the same phylogenomic groups supports their status. On the other hand, we identified 83 species-level clusters within the 327 Rhodococcus genomes. These clusters were established with the conventional threshold of 70% dDDH, which corresponds to a distance of 0.036 between genomes. The clustering result is in total agreement with the reference partition (i.e., MRI = 1, Figure2). Thirty of these clusters contain sequenced type strains genomes, while the remaining 53, either correspond to previously not sequenced type strains or are novel species, which should be properly validated in accordance with standards in nomenclature. Surprisingly, several genomes of type strain species clustered together, achieving dDDH% values higher than 70% (Supplementary Table S2). These include R. imtechensis RKJ300T and R. opacus ATCCC 51882T (80.2% dDDH, 77.3–82.9% confidence interval and 90.77% probability of same species) and R. biphenylivorans TG9T and R. pyridinivorans DSM 44555T (88.3% dDDH, 85.9–90.4% confidence interval and 95.2% probability of belonging to the same species), whose species status should be properly revised. In addition, Rhodococcus sp. WAY2 achieved a 70.2% dDDH with Rhodococcus sp. S2-17, with a 67.2%–73% confidence interval and a 78.63% probability of same species. In order to investigate whether the diversity of PGs and species found within the Rhodococcus sequenced genomes had achieved its maximum representation, we conducted rarefaction analyses. The results are shown in Figure3. In both cases, curves are far from reaching an asymptote with 327 genomes sampled, and extrapolation analysis up to 1000 genomes still shows an increment in the number of clusters, which will probably grow to the hundreds in the case of species and above 50 in the case of PGs (Figure3). This is evidence that the diversity exhibited by the Rhodococcus genus will increase as long as new genomes are sequenced and is in agreement with the fact that most of the PGs are composed of only one genome. Microorganisms 2020, 8, 774 5 of 16

▉ ▉ ▉ ▉ ▉ ▉▉▉▉▉▉▉▉▉▉▉▉▉ ▉ ▉ ▉▉▉ ▉▉▉▉▉▉▉

······························ ····························

······························

▉ ··························· ▉ ······································· ···························

▉ ·································· ······························· ▉

································ ···························· ································

▉ ········ ▉ ··························

▉ ·········· ▉ ▉▉▉▉▉▉ ·····························

························· ▉ ▉▉ ▉ ▉ ▉▉ ······························ ▉

▉ ···································· ▉ ▉ ▉▉▉ ······················ ▉

·································· ▉ ▉ ▉ ▉▉ ······················· ▉

▉ ···································· ▉▉ ▉▉ ································· ▉

······································ ▉ ▉ ▉ ▉ ·························· ▉

···································· ▉ ▉

▉ ···························· ▉

··························· ▉ ▉ ▉ ▉ ····························· ▉

··························· ▉ ▉

▉ ▉ ▉ ▉ ·······························

▉ ▉ ▉ ···························· ▉ ▉ ▉ ···························

▉ ▉ Rhodococcus yunnanensis ▉ ···························· ▉ ▉ ▉ ······························· ▉

▉ ▉

····························▉ ▉ ▉ ·····························

▉ Rhodococcus kyotonensis JCM 23211 ▉ ▉

▉ ···························· ▉ ▉ ····························· ▉

▉ ·······················▉ ▉ ▉ ▉ ▉····························· ▉ ▉ ·······················▉ ▉ ▉ ···························▉ ▉ ▉ ▉ ▉ ▉ ▉ ··························· ▉ ▉ ▉ ▉▉ ·······················▉▉ ▉ ▉ ▉ ▉ ·······························▉ ▉ ▉ ·······························▉ ▉

▉ ▉ ▉ sp. 114MFTsu3.1 sp.

Rhodococcus Rhodococcus ▉ ▉

▉ ··················▉

······ ▉

sp. 29MFTsu3.1 sp.

▉ Rhodococcus

sp. 15-1189-1-1a sp.

▉ Rhodococcus T

sp. 14-2470-1b sp. Rhodococcus

Rhodococcus kyotonensis KB10 02-816c fascians Rhodococcus

····························· sp. 14-2686-1-2 sp.

▉ ▉ Rhodococcus ▉ ▉

sp. 06-1059B-a sp. Rhodococcus

▉ ▉

·····························▉ ▉ A73a fascians Rhodococcus

Rhodococcus fascians A22b

▉ Rhodococcus fascians fascians Rhodococcus sp. 14-2470-1a sp. Rhodococcus

▉ ▉ ▉ PG 1

······························ ···············

sp. 15-1154-1 sp.

Rhodococcus Rhodococcus ▉

Rhodococcus T

sp. 06-621-2 sp. ▉ Rhodococcus ▉ Rhodococcus sp. CUA-806

Rhodococcus

·······························▉ sp. 15-649-1-2 sp.

Rhodococcus sp. PAMC28705 Rhodococcus

▉ ▉ ▉

Rhodococcus sp. EPR-157

02-815

▉ fascians Rhodococcus sp. 06-156-3C sp. Rhodococcus ▉ ······························ ▉ ▉

▉ Rhodococcus fascians sp. RS1C4 sp. Rhodococcus Rhodococcus ▉

▉ Rhodococcus ▉

··························· Rhodococcus fascians LMG 3602

▉ Rhodococcus

Rhodococcus ▉

▉ Rhodococcus sp. Eu-32 Rhodococcus sp. 05-2256-B2

Rhodococcus sp. P1Y ▉

▉ sp. 06-156-4a sp. Rhodococcus

······················· Rhodococcus sp. X156 ▉

Rhodococcus sp. 05-2256-B4

Rhodococcus PG 2 - R. fascians

sp. 06-156-3b sp. ▉ Rhodococcus

Rhodococcus fascians Rhodococcus sp. 1163 ▉

sp. 06-156-4C sp. Rhodococcus ▉ ·································▉ Rhodococcus fascians LMG 3623Rhodococcus ▉ Rhodococcus sp. 1168 ▉ ▉ ▉ Rhodococcus ····················· ▉ ▉ ▉ ▉··························· ▉ ▉ ▉ ··························▉ ▉ ▉ ▉ ▉ NBRC 103083 PG 3 ·························· ▉ ▉ Rhodococcus fascians

▉ ▉ 06-418-1B sp. ▉ ▉ Rhodococcus fascians sp. PAMC28707 ▉ ···································▉ ▉ Rhodococcus sp. 05-2254-5 ··································· ▉ ▉ ·······································

▉ ································ sp. 05-2256-B3 ··································· ▉ Rhodococcus sp. OK269 ▉ 06-156-4 sp.

▉ sp. 05-2256-B1 ▉ 06-156-3 sp. ▉······························ Rhodococcus fascians ▉ ▉ PG 4 ▉ ▉ ····································

▉ Rhodococcus fascians 05-561-1 sp. 06-1477-1B ▉ ······························· Rhodococcus sp. Leaf233 A44A ▉ Rhodococcus sp. 05-2255-1e sp. 06-1474-1B ···································· ▉ ▉ Rhodococcus sp. 06-221-2 ···················· ▉ ▉······························ T ▉ ▉ ▉ ····················· ▉ ▉ LMG 3616 T ·················· ······································· ▉ ▉ ▉ Rhodococcus fascians GIC36 ▉ T ▉ PG 5 ▉ ▉··················· Rhodococcus fascians ······································································· ▉ Rhodococcus LMG 3605 ▉ ▉ ▉ Rhodococcus ▉ ···················· sp. 05-2254-1 ······································ ▉ ▉ ▉ ······················································································ ▉ ▉ ▉ ▉ ▉································· ▉ ▉ PG 6 ▉ ▉······························ D188 ▉ ▉ A3b ▉ ▉························ Rhodococcus fascians Rhodococcus 14-2632-D2 Rhodococcus sp. JG-3 ▉ ▉ Rhodococcus fascians ···················· ▉ ▉ Rhodococcus fascians T ▉ ▉ T ▉ ······················ Rhodococcus fascians A78 ··········································· ▉ sp. PML026 Rhodococcus kroppenstedtiiRhodococcus DSM sp. 44908Leaf225 ······························································· ▉ ▉ sp. 06-1460-1B Rhodococcus sp. PBTS 1 PG 7 - R. kyotonensis ▉ Rhodococcus fasciansRhodococcus fascians B3 Rhodococcus sp. Leaf258 ··········································· ▉·························· Rhodococcus corynebacterioidesRhodococcus sp. UNC23MFCrub1.1 NBRC 14404 ········································· ▉ ▉ ▉ ▉ ······································ ▉ ▉ ························· Rhodococcus sp. MEB064 ▉ ▉ Rhodococcus sp. Leaf247 ······························ ▉ ▉ ▉ ▉······················· GIC26 Rhodococcus sp. Leaf7 ▉ ▉ ▉ ▉ sp. PBTS 2 ▉ ▉ PG 8 - R. yunnanensis ························ Rhodococcus Rhodococcus fascians sp. 1R11 ······························ ▉ ▉ RhodococcusRhodococcus fascians Rhodococcus sp. AD45-ID ▉ ▉ Rhodococcus sp. AD45 ▉ ▉ ▉························ Rhodococcus sp. OK551 ······································· ▉ ▉ ▉ Rhodococcus sp. PMG 254 ······························ ▉ ················· Rhodococcus ··································· ▉ PG 9 ▉ ▉ 14-2632c-1 F7 Rhodococcus globerulus WS3306 ▉ Rhodococcus sp. 15-649-2-2 Rhodococcus globerulus NBRC 14531 ▉ ▉ ▉························ Rhodococcus sp. 14-2483-1-1 15-508-1b A2 ▉ Rhodococcus rhodochrous NRRL B-1306 ······························· ▉ ▉ ▉ ▉·························· RhodococcusRhodococcus sp. ACPA4 sp. OK302 ····························· ▉ ▉ Rhodococcus sp. AW25M09 Rhodococcus sp. 4J2A2 ▉ ▉························ Rhodococcus fascians Rhodococcus sp. KBW08 ··········································· ▉ ▉ PG 10 ▉ Rhodococcus sp. 06-412-2Bsp. WWJCD1 Rhodococcus sp. KBS0724 ▉ ▉ A21d2 Rhodococcus erythropolis ACN1 ▉ ▉ ························ 04-516 ▉ ▉ ▉ Rhodococcus erythropolis JCM 3201 ······································· ····························· ▉ ························· Rhodococcus fascians · ······································ ▉ ▉ ▉ sp. 06-412-2C Rhodococcus erythropolis BG43 ▉ ▉ Rhodococcus sp. H-CA8f ▉ ▉ ························· Rhodococcus sp. Leaf278 ▉ PG 11 ▉ Rhodococcus erythropolis R138 ·································· ▉ ▉ ▉ Rhodococcus Rhodococcus sp. 164Chir2E ▉ ······················· ···································· ▉ ▉ ▉ Rhodococcus Rhodococcus erythropolis AV96 ································ ▉ ························· LMG 3625 ▉ ▉ ▉ Rhodococcus sp. 66b ······························ ▉ Rhodococcus Rhodococcus erythropolis PR4 ▉ ▉ PG 12 ▉ ···························· ······················ ▉ Rhodococcus sp.sp. 05-2254-4 06-235-1A Rhodococcus erythropolis VSD3 ······························ ▉ ▉ ▉ A25f ▉ ····························Rhodococcus sp. 05-2254-3 Rhodococcus sp. ARP2 ······················· ▉ ▉ ▉ Rhodococcus sp. AQ5-07 ▉ ························ ▉ ▉ ▉ ···················· Rhodococcus sp.sp. 05-2254-6 05-2254-2 Rhodococcus erythropolis NSX2 PG 13 - R. corynebacteroides ▉ ··················· ▉ ▉ ▉ Rhodococcus sp. LP 11 YM ▉ ·························· sp. 14-1411-2a ▉ ▉ Rhodococcus Rhodococcus sp. LP 3 YM ▉ ▉ ▉ ····················· Rhodococcus Rhodococcus erythropolis MI2 ▉ ▉ Rhodococcus ▉ ▉ Rhodococcus erythropolis S-43 ·································· ▉ PG 14 ▉ ··························· ▉ ▉ sp. EPR-279 Rhodococcus erythropolis 1159 ································ ▉ ▉ Rhodococcus fascians A76 ▉ ▉ ··························· Rhodococcus sp.sp. 14-2483-1-2 15-2388-1-1asp. EPR-147 Rhodococcus erythropolis NCTC8036 ▉ ▉ ▉ ▉ Rhodococcus opacus NRRL B-24011 ················· ▉ ························· ▉ PG 15 ▉ ▉ Rhodococcus Rhodococcus erythropolis ············································CCM2595 ·································· ▉ ▉ ▉ ▉ Rhodococcus rhodochrous ATCC 17895 ·································· 23b-28 ▉ Rhodococcus sp. 05-340-1 ▉ ▉ ▉ Rhodococcus erythropolis XP ▉ ·························· Rhodococcus sp. 05-340-2 ▉ ▉ ▉ Rhodococcus erythropolis DN1 PG 16 - R. globerulus ▉ ············································ ▉ ▉ ··························· ▉ Rhodococcus sp. P27 ······································ ▉ ▉ Rhodococcussp. SBT000017 sp. BS-15 ▉ ▉ Rhodococcus Rhodococcus enclensis ··················· ······················ ▉ ▉ Rhodococcus erythropolis············································ NRRL B-16532 ▉ ▉ Rhodococcus ▉ ······················ ······················· ▉ ▉ PG 17 sp. 06-462-5 Rhodococcus sp.sp. 1139 D-1 ▉ Rhodococcus Rhodococcus sp. 311R ▉ sp. 02-925g ··················· ▉ ▉ ▉ ·························· Rhodococcus sp. 05-2221-1B ▉ sp. 14-2496-1d Rhodococcus ··················· ▉ ▉ ▉ ▉ ·················· Rhodococcus Rhodococcus erythropolis ATCCJCM 15903 9804 ▉ ▉ ▉ ▉ Rhodococcus fascians 05-339-1 Rhodococcus erythropolis IEGM 267 ························ ▉ ▉ PG 18 - R. erythropolis ▉ ····························· sp. 05-339-2 Rhodococcus erythropolis JCM 9803 ▉ ······················· ▉ ▉ ▉ Rhodococcus erythropolis ▉ ························· ▉ ▉ Rhodococcus TUHH-12 ······················ ▉ Rhodococcus erythropolis SK121 ▉ ▉ ························· Rhodococcus sp.sp. 06-418-5 B7740 JCM 6824 ▉ PG 19 - R. marinonascens ▉ ▉ Rhodococcus qingshengii ▉ Rhodococcus sp. 06-470-2 ······························· ▉ ▉ ······················· ▉ Rhodococcus erythropolis ▉ Rhodococcus sp. 06-469-3-2 ▉ ▉ ▉ Rhodococcus erythropolis ········································· B7g ▉ ····················· ▉

▉ PG 20 ▉ Rhodococcus sp. 06-1477-1A Rhodococcus sp. ADH ▉ ·········································· ▉ ····················· ▉ ▉ Rhodococcus sp. 008 ▉ Rhodococcus sp. 15-725-2-2b ··································· ▉

▉ ▉ Rhodococcus sp. EPR-134 ▉ ···················· Rhodococcus sp. 05-2255-3B1 ························ ▉ ▉ ▉ Rhodococcus erythropolis CAS922i ▉ PG 21 ····················· Rhodococcus sp. 05-2255-3C Rhodococcus genus BKS 20-40 ······················ ▉

▉ ▉

Rhodococcus qingshengii ▉

▉ ▉ ▉ ·················· Rhodococcus sp. 05-2255-2A2 Rhodococcus erythropolis JCM 9805 ·····················

Rhodococcus sp. NJ-530 ···································· ▉

▉ PG 22 - R. opacus

327 genomes Rhodococcus sp. YH3-3 ▉

······································ ▉

▉ s Rhodococcus sp. YL-0

s 0.05

r ········································

r

Rhodococcus ▉

sp. YL-1 ▉

e

e ·······································

Rhodococcus sp. BH4 ▉ PG 23 - R. rhodochrous

t

t

Rhodococcus baikonurensis ········································· JCM 18801 ▉

s

s

Rhodococcus ▉

u

u

l l sp. KB6

Rhodococcus qingshengii ········································· ················ ▉ PG 24 ▉

c

c

Rhodococcus qingshengii ▉

CW25 ▉

Rhodococcus qingshengii ▉

s ···························· c

i

Rhodococcus qingshengiiMK1 JCM 15477 ▉

e ······························

▉ i

Rhodococcus sp. AJR001 CS98 ▉ PG 25 - R. coprophilus

m

c

Rhodococcus marinonascens ···························· ▉

o

e Rhodococcus ▉

T

n

··································· ▉ ATCC 700358 (outgroup) Rhodococcus sp. WAY2 ▉ p ···············

sp. WMMA185 PG 26 - R. ruber e ▉

Rhodococcus ▉

S Rhodococcus jostii

g NBRC 14363 ▉

································

Rhodococcus jostii

T o sp. S2-17

l ▉

Nocardia brasiliensis ···································· ·······

Rhodococcus PG 27

y ······································

NCTC 630 Rhodococcus wratislaviensisDSM 44719 C31-06 ▉

T

h

Rhodococcus ▉

sp. PMGNBRC 259 16295

T Rhodococcus koreensis DSM 44498 ······························· ▉ ▉

P

▉ PG 28 - R. triatomae

Rhodococcus T ▉

Rhodococcus opacussp. NCIMBB4 12038

▉ ▉

NBRC 100604 ·························································

Rhodococcus wratislaviensis WS3308 ▉ ▉

PAM1357 Rhodococcus wratislaviensis NCTC 13229

▉ sp. ACS1 ▉

Rhodococcus wratislaviensis ······················ PG 29 - R maanshanensis ▉

T Rhodococcus opacus R7 ·····························

Rhodococcus jostii RHA1 ······································· ▉

Rhodococcus rhodnii ▉

▉ ······································T

······························ Rhodococcus rhodochrous Rhodococcus sp. DK17

···················

Rhodococcus hoagii ▉

▉ Rhodococcus

▉ PG 30 - R. tukisamuensis

················ Rhodococcus kunmingensis DSM 45001 Rhodococcus opacus ▉ ▉

Rhodococcus hoagii PAM1643

Rhodococcus ▉

········

▉ Rhodococcus hoagii PAM2279PAM1533 Rhodococcus

▉ ▉

····················· ▉

Rhodococcus sp. ACPA1 ▉ ······················· Rhodococcus hoagii PAM1340 ······································

sp. JVH1 ▉ PG 31

RhodococcusRhodococcus sp. PMG opacus 084 PD630

▉ NBRC 100605

▉ ·······································

······················

▉ ▉

···············

▉ ·······································

RhodococcusRhodococcus hoagii hoagii DSM PAM1496 20295 Rhodococcus imtechensissp. LB1 RKJ300 ▉ ▉

▉ ······················ Rhodococcus opacus

▉ ········································

Rhodococcus hoagii PAM2288 sp. SC4 ▉

▉ Rhodococcus wratislaviensis IFP 2016 04-OD7 T

▉ Rhodococcus opacus M213 PG 32

······················ Rhodococcus sp. Br-6 ▉

▉ Rhodococcus hoagii Rhodococcus opacus ·········································· ········ ▉

▉ ················· Rhodococcus hoagii PAM1413

▉ Rhodococcus opacus ········································································ ▉

▉ Rhodococcus opacus ATCC 51882

▉ ······················· Rhodococcus hoagii N1301 ▉ ▉

PG 33

▉ ····································· ▉

······················· Rhodococcus hoagii N1288 ▉

▉ Rhodococcus biphenylivorans

▉ ······················· Rhodococcus hoagii 103S ·······································

▉ ▉

Rhodococcus sp. R04 ▉

▉ ·································

Rhodococcus hoagii PAM2276 Rhodococcus pyridinivorans 1CP

▉ ······················· ▉ ▉

▉ Rhodococcus PG 34

Rhodococcus pyridinivorans DSM 44555 ▉ ▉

▉ ·································· Rhodococcus pyridinivorans

Rhodococcus hoagii PAM2287 ····································

▉ 8

Rhodococcus pyridinivorans ▉

··························· ···········································

▉ Rhodococcus hoagii PAM2285 Rhodococcus ▉

Rhodococcus pyridinivorans DSM 44186 ··································

Rhodococcus hoagii PAM2282 ▉

▉ ··························· Rhodococcus pyridinivorans SB3094 T

▉ PG 35

Rhodococcus sp. HS-D2 ▉

▉ ···················· Rhodococcus hoagii PAM1637 ▉

······················· Rhodococcus sp.Rhodococcus IITR03 sp. P52

Rhodococcus sp. 2G ▉ ▉ ▉

Rhodococcus hoagii PAM1475 sp. 852002-51564 SCH6189132-a ▉ ····························· Rhodococcus pyridinivorans ▉ ▉

Rhodococcus hoagii PAM1571 ························ ▉

▉ T T

······················ Rhodococcus hoagii PAM1557 ▉

RhodococcusRhodococcus rhodochrous pyridinivorans PG 36 - R. defluvii ▉ Rhodococcus sp. R1101

··························· ▉ ▉ Rhodococcus ····································

······················ Rhodococcus hoagii PAM1354 RhodococcusRhodococcus rhodochrous rhodochrous ▉

▉ sp. Chr-9 ▉

Rhodococcus hoagii PAM2012 ▉

▉ T

······················· ▉

▉ Rhodococcus rhodochrous J3

Rhodococcus gordoniae ▉ T ▉

Rhodococcus hoagii PAM1216 Rhodococcus ····················· ▉

▉ AK37 TG9

······················· DSM 20307 ▉ ▉

Rhodococcus hoagii PAM1572 Rhodococcus phenolicus DSM 44812 ▉

T Rhodococcus coprophilus ▉

▉ Rhodococcus zopfii T PG 37 - R. agglutinans

Rhodococcus hoagii PAM1593 ▉ ▉ ·······················

▉ 11Y

▉ T

Rhodococcus hoagii PAM1204 ····································ZKA33 ················· ▉

▉ ······················

······················· ▉

Rhodococcus hoagii PAM1600 ▉

GF3 ▉ ▉

······················· ▉

T

▉ sp. BUPNP1 ▉

▉ BKS 15-14

·············································································· ▉

BCP1

▉ ······················· ···································

Rhodococcus hoagii N1295 ▉ PG 38

▉ YF3

▉ Rhodococcus hoagii NCTC 5650 WB1 sp. ············································· P14

▉ Rhodococcus hoagii DSSKP-R-001

······················· ········································ ▉

EsD8 sp.

Rhodococcus hoagii NCTC 1621 ▉

CCTCC AB2014297 ▉

······················· DSM 44675 sp. Q1 ▉

Rhodococcus hoagii PAM1271 Chol-4 ······················· ▉ ▉

T ▉ sp. ENV425 sp.

▉ ·······················

Rhodococcus hoagii PAM1422Rhodococcus sp. OK519 ▉

▉ PG 39 - R. hoagii

▉ ZKA49 ········

sp. NEAU-CX67

······················· ▉

▉ 20-38 BKS

▉ Rhodococcus sp. AG1013 ··················· ··

Rhodococcus sp. C9-28 ······································· ▉

Rhodococcus hoagii NBRC 101255 ▉

·······················▉ Rhodococcus hoagii BKS6-46

▉ Rhodococcus sp. ABRD24

Rhodococcus hoagii PAM2274 ▉

······································· ································KG-16 ▉

▉ Rhodococcus hoagii ATCC 33707 EP4 ▉ NBRC 15591

························ ▉

Rhodococcus defluvii Ca11 Rhodococcus sp. RD6.2 ▉

··················· ···················· ▉ ▉

▉ NBRC 100606 NCTC 10210

Rhodococcus sp. HA99 ▉ PG 40 - R. kunmingensis

sp. M8 sp. Rhodococcus Rhodococcus Rhodococcus

······················ NCTC 13296 ▉

▉ ··························

▉ Rhodococcus

RhodococcusRhodococcus sp. LHW51113 sp. LHW50502 ▉

······················▉ ···················

Rhodococcus sp. OK270 ·························· ▉

Rhodococcus sp. OK611 ▉

▉ ···················

···························· ▉

Rhodococcus sp. JCM 9793 ▉

▉ Rhodococcus sp. JCM 9791 Rhodococcus Rhodococcus ▉

························ Rhodococcus sp. MTM3W5.2 NCTC 10994 ▉ ▉

▉ ▉

YYL ruber Rhodococcus ▉

OA1 ruber Rhodococcus

▉ Rhodococcus ruber

▉ Rhodococcus

Rhodococcus ruber P25 rhodochrous Rhodococcus

·················· Rhodococcus ruber SD3 ▉ PG 41 - R. rhodnii

Rhodococcus triatomae ▉

························▉

Rhodococcus ruber ▉

▉ YC-YT1 ruber Rhodococcus

······················ ▉

▉ IcdP1

Rhodococcus aetherivorans aetherivorans Rhodococcus T

▉ ····················· Rhodococcus sp. UNC363MFTsu5.1

▉ ▉

231 IEGM

ruber Rhodococcus ▉

▉ T

························ aetherivorans Rhodococcus ·········· ▉

T

Rhodococcus ruber ruber Rhodococcus

▉ ············· PG 42

································▉

Rhodococcus agglutinans ▉

T

················ ▉

KG-21 rhodochrous Rhodococcus ▉ ▉

▉ TRN71 rhodochrous Rhodococcus

·

▉ ·············

···························· ▉

Rhodococcus tukisamuensis JCM 11308 T

Rhodococcus triatomae DSM 44892 ▉

··································

Rhodococcus ruber ▉

········· ▉

▉ ·································

ATCC 21198 ATCC rhodochrous Rhodococcus ▉

▉ ▉

···································· ▉

▉ Rhodococcus maanshanensis

▉ ····························

·····························

▉ ▉

······················ ▉

▉ ·····································

▉ ▉

▉ ··········

··················· ▉ ▉ ▉

······· ▉

·····················

▉ ▉

▉ ···································· ▉

······························ ▉ ▉

···························· ▉ ····································

▉ ····································· ▉

·································· ▉

▉ ▉

▉ ·····················

·································· ▉ ▉

▉ ▉

▉ ▉

▉ ▉ ··············· ▉ ▉

▉ ············· ▉ ▉ ································· ▉ ▉

·································· ▉

▉ ▉ ▉ ▉

································· ▉

······························· ▉

········································

················ ····················

▉ ····················

································ ······················· ···························· ▉

▉ ····························· ································

·································

·································

▉ ·································

··························

▉ ▉

▉ ▉

▉ ▉ ▉ ▉

▉ ▉

▉ ▉ ▉ ▉ ▉ ▉ ▉ ▉ ▉ ▉ ▉ ▉ ▉ ▉ ▉

Figure 1. Genome-to-genome blast distance phylogeny (GBDP)-based phylogeny of 327 Rhodococcus genomes. The neighbor-joining tree was built using the GBDP intergenomic distances. Nocardia brasiliensis ATCC 700358 was used as outgroup. Clusters at the species level (inner circle) or phylogenomic groups (PGs, outer circle) are defined by OPTSIL clustering of intergenomic distances. Colors according to PG. Blue, bold and T indicate type strain. Rhodococcus sp. WAY2 is highlighted in yellow and red typing. Microorganisms 2020, 8, 774 6 of 16

(a) Species clustering (b) Groups clustering F = 0.5, distance = 0.036, dDDH% = 70%, clusters = 83 F = 0.5, distance = 0.1395, dDDH% = 29.8%, clusters = 42 1.000

1.00 300 1.00 300 0.995

0.990

0.75 Number of clusters 0.75 Number of clusters 0.0320.0340.0360.0380.040 Clusters 200 200 1.0000 Clusters no. 0.9975 x (MRI) consistency x (MRI) consistency e e 0.50 0.50 0.9950 MRI 0.9925 Species-level 0.9900 100 100 Groups-level 83 0.25 0.25 0.1300 0.13950.1430 Modified Rand Ind

Modified Rand Ind 42

0.00 0 0.00 0

0.000 0.036 0.050 0.100 0.150 0.200 0.0000 0.0500 0.1000 0.13950.1500 0.2000 Distance Threshold T Distance Threshold T

(c) Species-level matrix (d) Groups-level matrix (e) dDDH matrix

From PG 1

To PG 42

distance distance dDDH%

0.000 0.010 0.020 0.030 0.036 0.0000 0.0250 0.0500 0.1000 0.1395 20 40 60 80 100

Figure 2. Clustering analysis of 327 Rhodococcus genomes using a range of distance thresholds T. Total cluster consistency (i.e., MRI = 1) was achieved using an average linkage (i.e., F = 0.5) at both species-level (a) and groups-level clusters (b) compared to the reference partition. Clustering was performed with the OPTSIL software v1.5 [37]. Distance matrices (c,d) and digital DNA–DNA Microorganismshybridization 2020, 8 (dDDH), 774 matrix (e) show these clusters from PG 1 (upper-right) to PG 42 (lower-left). 7 of 16

Figure 3. Interpolation/extrapolationInterpolation/extrapolation rarefaction analysis of the clusters at species and phylogenomic groups levels (left and right respectively),respectively), using 10001000 replicatesreplicates andand aa 95%95% confidenceconfidence interval.interval.

3.2. Phylogeny Based on Single-Copy Proteins The comparison of 237 strains proteomes resulted in a total of 17,258 orthologous groups (OGs). Among these OGs, 212 appeared in all the genomes as single-copy amino acid sequences. These OGs were used to construct a ML phylogenetic tree shown in Figure 4, whose clustering pattern is consistent with a previous phylogenetic analysis also based on amino acid sequences [20]. The same PGs found in the GBDP-based phylogenomic analysis (Figure 1) are also identified with total bootstrap support using amino acid sequences, which validates the genome clustering reported here. Nonetheless, PGs 4, 28, 41, and 42, all composed of single strains, are distant and separated from their closest PGs compared with the GBDP-based tree, probably due to different evolutionary pressure on the core fraction versus the whole genome content. In the case of PGs 41 and 42, composed of R. rhodnii NBRC 100604T and R. rhodochrous NCTC 630, respectively, the high distance in the single-copy amino acid tree is also observed at the genomic level, being the earliest-diverging groups within the Rhodococcus genus (Figure 1). In addition, PG 41 and PG 28 (R. rhodnii NBRC 100604T and R. triatomae DSM 44892T, respectively) are clustered together in the amino acid-based phylogeny, which agrees with a previous report [20]. In the specific case of PG 4 (composed of Rhodococcus sp. X156 genome), the unusually high GC% content (72.2) of this genome could result in a biased codon usage [52], which might explain the differences between the GBDP-based tree and its high divergence in the amino acid-based phylogeny. Aside from these differences, both the GBDP and the amino acid-based analyses show a robust PG identity, maintain the same strain composition, and a similar phyletic pattern.

Microorganisms 2020, 8, 774 7 of 16

3.2. Phylogeny Based on Single-Copy Proteins The comparison of 237 strains proteomes resulted in a total of 17,258 orthologous groups (OGs). Among these OGs, 212 appeared in all the genomes as single-copy amino acid sequences. These OGs were used to construct a ML phylogenetic tree shown in Figure4, whose clustering pattern is consistent with a previous phylogenetic analysis also based on amino acid sequences [20]. The same PGs found in the GBDP-based phylogenomic analysis (Figure1) are also identified with total bootstrap support using amino acid sequences, which validates the genome clustering reported here. Nonetheless, PGs 4, 28, 41, and 42, all composed of single strains, are distant and separated from their closest PGs compared with the GBDP-based tree, probably due to different evolutionary pressure on the core fraction versus the whole genome content. In the case of PGs 41 and 42, composed of R. rhodnii NBRC 100604T and R. rhodochrous NCTC 630, respectively, the high distance in the single-copy amino acid tree is also observed at the genomic level, being the earliest-diverging groups within the Rhodococcus genus (Figure1). In addition, PG 41 and PG 28 ( R. rhodnii NBRC 100604T and R. triatomae DSM 44892T, respectively) are clustered together in the amino acid-based phylogeny, which agrees with a previous report [20]. In the specific case of PG 4 (composed of Rhodococcus sp. X156 genome), the unusually high GC% content (72.2) of this genome could result in a biased codon usage [52], which might explain the differences between the GBDP-based tree and its high divergence in the amino acid-based phylogeny. Microorganisms 2020, 8, 774 8 of 16

Figure 4. Maximum-likelihoodFigure 4. Maximum-likelihood phylogenetic phylogenetic treetree of the the RhodococcusRhodococcus genusgenus based basedon 212 single-copy on 212 single-copy amino acid sequences. PGs according to those identified in this study. Grey dots indicate PGs amino acidcomposed sequences. of multiple PGs according genomes. toBootstrap those identifiedsupport is indicated in this study. above/below Grey dotsbranches, indicate not shown PGs composed of multipleinside genomes. PGs. Bootstrap support is indicated above/below branches, not shown inside PGs.

3.3. Genome Fractions of the Rhodococcus Genus The orthologous groups identified by the comparative analysis were used to identify the core genome, the pangenome and the strain-specific genome fractions. The core genome of the Rhodococcus genus, which consists of those OGs which are represented in all genomes (“hard” core), is composed of only 381 OGs (Figure 5a). However, given the number of genomes included in the study, a “soft core” where a high percentage of genomes are represented, rather than the 100%, is probably more accurate. Considering a presence in at least 99.16% of the genomes, we obtain a soft core of 1253 OGs that shifts to 1493 OGs when fixing the threshold to 98.73% of genomes (Figure 5a). Although there is no previous attempt to analyze the core genome of the Rhodococcus genus, but rather of certain groups [23,24], the number of “soft core” OGs is similar to that of other Actinobacteria genera. For example, analysis of 21 Mycobacterium genomes resulted in a core genome composed of

Microorganisms 2020, 8, 774 8 of 16

Aside from these differences, both the GBDP and the amino acid-based analyses show a robust PG identity, maintain the same strain composition, and a similar phyletic pattern.

3.3. Genome Fractions of the Rhodococcus Genus The orthologous groups identified by the comparative analysis were used to identify the core Microorganisms 2020, 8, 774 9 of 16 genome, the pangenome and the strain-specific genome fractions. The core genome of the Rhodococcus genus,ca. 1250 which OGs consists [53], while of those 17 OGsStreptomyces which are species represented (different in all genomesbacterial (“hard”order than core), Rhodococcus is composed and of onlyMycobacterium 381 OGs (Figure) present5a). a However, core of 2018 given OGs the number[54]. Core-genome of genomes size included depending in the study,on the a “softnumber core” of wheregenomes a high sampled, percentage as represented of genomes in areFigure represented, 5a, show rathers a rapid than decrease the 100%, in the is probably number moreof OGs accurate. within Consideringthe first randomly a presence sampled in at genomes, least 99.16% and of an the asympt genomes,ote is we almost obtain reached a soft core when of considering 1253 OGs that the shifts total to237 1493 genomes OGs when used fixingin the thestudy. threshold to 98.73% of genomes (Figure5a).

FigureFigure 5.5.Genome Genome fractions fractions of theof Rhodococcusthe Rhodococcusgenus. genus. Core genome Core genome (a), strain-specific (a), strain-specific (b), and pangenome (b), and (pangenomec) analysis representing (c) analysis mean representing values (line) mean and values Q1 and (line) Q3 and quantiles Q1 and (shadow) Q3 quantiles over 300 (shadow) replicates over of 237300 randomlyreplicates sampledof 237 randomly genomes. sampled In the case genomes. of core genome, In the case values of atcore a different genome, percentage values at ofa genomesdifferent sampledpercentage is indicated of genomes with sampled dashed lines.is indicated Red circles with inda (a)shed and lines. (c) indicate Red circles the maximum in (a) and number (c) indicate of OGs the achievedmaximum (below number/above of OGs the red achieved circle). (below/above the red circle).

AlthoughThe strain-specific there is nogenome previous fraction, attempt represented to analyze as thea function core genome of the ofnumber the Rhodococcus of new OGsgenus, over butsequentially rather of certain added groups genomes [23 ,24(Figure], the number5b), also of show “softs core” a rapid OGs reduction is similar towithin that of the other firstActinobacteria 50 sampled genera.genomes, For and example, then slowly analysis decreases of 21 toMycobacterium reach an averagegenomes of 33 OGs resulted within in a237 core genomes. genome This composed implies ofthat ca. within 1250 OGs50 genomes, [53], while most 17 Streptomyces of the Rhodococcusspecies (disharedfferent genetic bacterial content order is than achievedRhodococcus and moreand Mycobacteriumgenomes would) present only add a corespecific of 2018 sequences, OGs [54 which]. Core-genome is congruent size with depending the 42 PGs on identified the number in this of study. However, the fact that on average each Rhodococcus adds 33 specific OGs and the high standard deviation observed in the strain-specific curve (Figure 5b) indicate that more genomes will keep increasing the overall genetic diversity of Rhodococcus. This is further evidenced in the pangenome curve, which reaches 26,080 OGs within the 237 sampled genomes (Figure 5c) and keeps a positive slope, being an “open” pangenome. The pangenome size of the Rhodococcus genus is similar to that reported in Mycobacterium and Streptomyces, composed of ca. 20,000 and 34,592 OGs, respectively [53,54].

Microorganisms 2020, 8, 774 9 of 16 genomes sampled, as represented in Figure5a, shows a rapid decrease in the number of OGs within the first randomly sampled genomes, and an asymptote is almost reached when considering the total 237 genomes used in the study. The strain-specific genome fraction, represented as a function of the number of new OGs over sequentially added genomes (Figure5b), also shows a rapid reduction within the first 50 sampled genomes, and then slowly decreases to reach an average of 33 OGs within 237 genomes. This implies that within 50 genomes, most of the Rhodococcus shared genetic content is achieved and more genomes would only add specific sequences, which is congruent with the 42 PGs identified in this study. However, the fact that on average each Rhodococcus adds 33 specific OGs and the high standard deviation observed in the strain-specific curve (Figure5b) indicate that more genomes will keep increasing the overall genetic diversity of Rhodococcus. This is further evidenced in the pangenome curve, which reaches 26,080 OGs within the 237 sampled genomes (Figure5c) and keeps a positive slope, being an “open” pangenome. The pangenome size of the Rhodococcus genus is similar to that reported in Mycobacterium and Streptomyces, composed of ca. 20,000 and 34,592 OGs, respectively [53,54].

3.4. Distribution of PAHs and Alkane Degradation Genes Rhodococcus strains have the ability of degrading multiple organic compounds, including PAHs, dioxin and dioxin-like compounds, and different chain-length n-alkanes [10–12]. Degradation of aromatic compounds is commonly carried out by Rieske 2Fe-2S dioxygenase systems, including those involved in biphenyl/PCBs, ethylbenzene, and naphthalene degradation (bph, etb and nah gene clusters), which present a wide range of substrate specificity and have been reported in multiple Rhodococcus strains [11,33,55,56]. Rhodococcus genomes can simultaneously possess several of these systems [13,32]. Among them, Rhodococcus sp. WAY2 contains 5 different clusters putatively involved in the degradation of many aromatic compounds and a tmo gene cluster putatively involved in the conversion of toluene into p-cresol [32,57]. The OGs, which include the genes of these clusters in WAY2, were searched to address their distribution within the Rhodococcus genus and are shown in Figure6a. Alpha subunits of these Rieske 2Fe-2S dioxygenases (BphA1a, BphA1b, EtbA1a, EtbA1b, and NahA1) are widely distributed within the genus PGs. However, they are missing from PGs 3, 20, 28, 33, and 34, and partially present in PGs 12, 13, and 18. Interestingly, the beta subunits of these dioxygenases (BphA2a, BphA2b, EtbA2a, EtbA2b, and NahA2a) have a more discrete distribution, being only present in 15 PGs (5, 6, 16, 17, 21, 22, 26, 30, 31, 35, 36, 37, 38, 39, and 41) and partially present in another 4 PGs (2, 14, 23 and 29), which also harbor the alpha subunits (Figure6a). These PGs contain known degraders of aromatic compounds, including R. jostii RHA1 (PG 22) [13] and Rhodococcus sp. WAY2 (PG 21) [32]. This finding suggests that the degradation of aromatic compounds might be restricted to these PGs, at least of those compounds whose biodegradation is initiated by Rieske ring-hydroxylating dioxygenases of the orthologous group analyzed. On the other hand, the tmo gene cluster involved in the conversion of toluene to p-cresol [57] has a more limited distribution, being only present in PGs 42 and 21 and partially present in PGs 16 and 22 (Figure6a), suggesting a specialized and distinctive metabolism of aromatic compounds in strains from these groups. Aliphatic compounds, on the other hand, can be degraded by several different pathways [58]. The first step is a monooxygenation catalyzed by soluble or particulate methane monooxygenases (sMMO or pMMO, respectively) for short chain n-alkanes [59,60], or alkane monooxygenases (AlkB) and long-chain alkane monooxygenases (LadA) for middle and long-chain n-alkanes, respectively [58,61–63]. The distribution of orthologous sequences of these genes and gene clusters within Rhodococcus PGs shows an interesting pattern (Figure6b). AlkB and LadA are found in most of the PGs (except PG 33, which does not harbor any of these genes), which suggests that almost all Rhodococcus strains could putatively degrade middle to long-chain n-alkanes. Conversely, sMMO subunits are present in a more limited number of groups (PGs 8, 2, 7, 9, 17, 20, 21, 22, 24, 26, and 42). Interestingly, mmoC, which encodes the iron–sulfur component of sMMO [64], is found in other groups that do not contain the remaining sMMO subunits (Figure6b). This could be explained by similar homology to other Microorganisms 2020, 8, 774 10 of 16 iron–sulfur electron transfer systems. Surprisingly, the pMMO system reported in Rhodococcus sp. WAY2 [32] is not found in any other PG or genome within the Rhodococcus genus, being a unique and distinctive feature of WAY2 (Figure6b). It has been reported that pMMO has a narrow substrate specificity, oxidizing n-alkanes up to C5, preferentially at the C2 position [65], and it has been found in several putative aerobic methanotrophic [66]. The absence of this cluster in other rhodococci could imply a horizontal transfer event and a novel catabolic acquisition that distinguish this strain from any other Rhodococcus, although further analyses are required to prove this hypothesis and test its functionality in Rhodococcus sp. WAY2. Microorganisms 2020, 8, 774 11 of 16

FigureFigure 6. Distribution6. Distribution of orthologousof orthologous groups groups (OGs) (OGs) involved involved in aromatic in aromatic (a) and ( aliphatica) and aliphatic (b) compound (b) compound degradation in Rhodococcus PGs. Color scale according to the fraction of genomes within degradation in Rhodococcus PGs. Color scale according to the fraction of genomes within each PG with each PG with the OG present. Colored boxes below enzyme names according to their cluster pattern the OG present. Colored boxes below enzyme names according to their cluster pattern in Rhodococcus in Rhodococcus sp. WAY2 (PG 21, highlighted in yellow and red typing). sp. WAY2 (PG 21, highlighted in yellow and red typing).

3.5.Nonetheless, Diversity of Rieske although, 2Fe-2S in thisDioxygenases study, only among the distribution Rhodococcus of Genomes the main traits reported in Rhodococcus sp. WAY2 haveThe diversity been explored, of Rieske other 2Fe-2S traits dioxygenases not found in has WAY2 been couldpreviously also showanalyzed, a distinctive either in patternwell-known among theand rest characterized of PGs in the sequences genus, which from require different further taxa [67,68] analysis. or in environmental samples [69,70]. We used the orthologous group containing these dioxygenases in the Rhodococcus genus to construct a 3.5.phylogenetic Diversity of Riesketree, to 2Fe-2S assess Dioxygenases their diversity among within Rhodococcus the genus. Genomes The orthologous group of these dioxygenasesThe diversity contains of Rieske 567 2Fe-2Ssequences, dioxygenases of which 339 has are been not previouslyidentical and analyzed, were used either to construct in well-known the andphylogeny characterized (Figure sequences 7a). These from sequences different taxainclud [67e,68 biphenyl] or in environmental 2,3-dioxygenases, samples naphthalene [69,70]. We 1,2- used dioxygenases, ethylbenzene 2,3-dioxygenases, phthalate 4,5-dioxygenases, 3-phenylpropionate the orthologous group containing these dioxygenases in the Rhodococcus genus to construct a dioxygenases, benzoate 1,2-dioxygenases, and other dioxygenases with known substrates (Figure 7a). Surprisingly most of the sequences constitute large and very diverse groups without known function/substrate annotated to date. All the sequences involved in the degradation of peripheral substrates (biphenyl, naphthalene, and ethylbenzene, among others) clustered together, along with certain groups of proteins with unknown substrate. Other groups of sequences involved in central aromatic metabolism (benzoate) or central nodes in aromatic degradation pathways (p-cumate, anthranilate, and terephthalate) also form distinct clusters. From the total of 339 unique sequences analyzed, the substrates of more than 200 remain unknown.

Microorganisms 2020, 8, 774 11 of 16 phylogenetic tree, to assess their diversity within the genus. The orthologous group of these dioxygenases contains 567 sequences, of which 339 are not identical and were used to construct the phylogeny (Figure7a). These sequences include biphenyl 2,3-dioxygenases, naphthalene 1,2-dioxygenases, ethylbenzene 2,3-dioxygenases, phthalate 4,5-dioxygenases, 3-phenylpropionate dioxygenases, benzoate 1,2-dioxygenases, and other dioxygenases with known substrates (Figure7a). Surprisingly most of the sequences constitute large and very diverse groups without known function/substrate annotated to date. All the sequences involved in the degradation of peripheral substrates (biphenyl, naphthalene, and ethylbenzene, among others) clustered together, along with certain groups of proteins with unknown substrate. Other groups of sequences involved in central aromatic metabolism (benzoate) or central nodes in aromatic degradation pathways (p-cumate, anthranilate, and terephthalate) also form distinct clusters. From the total of 339 unique sequences analyzed, the substrates of more than 200 remain unknown. Microorganisms 2020, 8, 774 12 of 16

FigureFigure 7. Rieske 7. Rieske 2Fe-2S 2Fe-2S dioxygenase dioxygenase phylogenetic phylogenetic tree tree (a(a)) andand abundanceabundance of of or orthologousthologous sequences sequences (b) among(b) Rhodococcusamong RhodococcusPGs. PGs. The The unrooted unrooted maximum-likelihood maximum-likelihood tree tree was was constructed constructed with with 339 unique 339 unique sequences found in the OG with Rieske 2Fe-2S dioxygenases. Sequences of dioxygenases (DO) with sequences found in the OG with Rieske 2Fe-2S dioxygenases. Sequences of dioxygenases (DO) with known function/substrate are highlighted in blue, and those without known function/substrate are known function/substrate are highlighted in blue, and those without known function/substrate are highlighted in grey. Dots indicate bootstrap support higher than 75%. The number of orthologous highlightedRieske 2Fe-2S in grey. dioxygenases Dots indicate found bootstrap in each supportanalyzed higherRhodococcus than genome 75%. The arenumber represented of orthologous in the Rieskebarplot. 2Fe-2S dioxygenases found in each analyzed Rhodococcus genome are represented in the barplot.

On theOn the other other hand, hand, the the number number ofof orthologsorthologs found found in in each each of ofthe the genomes genomes analyzed analyzed differs di ffers widelywidely (Figure (Figure7b). 7b). The The PGs PGs that that harbor harbor the the highest highest number of of Rieske Rieske 2Fe-2S 2Fe-2S dioxygenases dioxygenases are those are those of knownof known degraders degraders of of aromatic aromatic compounds.compounds. For For example, example, R. jostiiR.jostii RHA1RHA1 [13], R. [13 opacus], R. strains opacus B4strains B4 [71[71],], and and R7 R7 [72[72],], allall includedincluded in in PG PG 22, 22, contain contain 12, 12, 11 11,, and and 9 Rieske 9 Rieske 2Fe-2S 2Fe-2S dioxygenase dioxygenase orthologs, orthologs, respectively. Similarly, Rhodococcus sp. WAY2 [32] and Rhodococcus sp. S2-17, forming PG 21, contain 13 and 15 of these dioxygenases, respectively (Figure 7b), S2-17 being the strain with the highest number of Rieske 2Fe-2S dioxygenases identified. Therefore, there are probably novel functions and substrates that remain undiscovered within the large number of uncharacterized dioxygenases present in Rhodococcus genomes, which is consistent with the diversity of novel and not functionally characterized dioxygenases usually found in environmental studies [69,70].

Microorganisms 2020, 8, 774 12 of 16 respectively. Similarly, Rhodococcus sp. WAY2 [32] and Rhodococcus sp. S2-17, forming PG 21, contain 13 and 15 of these dioxygenases, respectively (Figure7b), S2-17 being the strain with the highest number of Rieske 2Fe-2S dioxygenases identified. Therefore, there are probably novel functions and substrates that remain undiscovered within the large number of uncharacterized dioxygenases present in Rhodococcus genomes, which is consistent with the diversity of novel and not functionally characterized dioxygenases usually found in environmental studies [69,70].

4. Conclusions The diversity of the Rhodococcus genus is reflected in the 42 phylogenomic groups (PGs) and 83 species clusters that are identified within more than 300 sequenced genomes. The number of PGs and species are likely to increase with the sequencing of more strains. Comparative genomic analysis shows a high degree of genetic diversity reflected in a small core genome of 381 orthologous groups and a large open pangenome of 26,080 PGs. The distribution of biodegradative traits among Rhodococcus PGs shows that although many of the Rhodococcus strains could potentially catabolize aromatic and aliphatic compounds, short-chain n-alkanes biodegradation is limited to a certain number of groups, and specialized metabolism of these alkanes is present in Rhodococcus sp. WAY2. Finally, the high number and diversity of Rieske 2Fe-2S dioxygenases with unknown substrate among rhodococci genomes makes the discovery of novel aromatic compounds’ degradation a possibility that requires further exploration.

Supplementary Materials: The following are available online at http://www.mdpi.com/2076-2607/8/5/774/s1, Table S1: List of Rhodococcus genomes used in this study, Table S2: GGDC intergenomic distances of the pairwise Rhodococcus genome comparisons, File S1: Orthologous groups identified among the Rhodococcus genomes, and File S2: R code used to obtain the genome fractions. Author Contributions: Conceptualization, D.G.-S., M.R.-N., M.M., and R.R.; methodology, D.G.-S. and M.R.-N.; software, D.G.-S. and M.R.-N.; validation, D.G.-S., M.R.-N., M.M., and R.R.; formal analysis, D.G.-S. and M.R.-N.; investigation, D.G.-S. and M.R.-N.; resources, D.G.-S. and M.R.-N.; data curation, D.G.-S. and M.R.-N.; writing—original draft preparation, D.G.-S.; writing—review and editing, D.G.-S., M.R.-N., M.M., and R.R.; visualization, D.G.-S.; supervision, M.R.-N., M.M., and R.R.; project administration, M.M. and R.R.; funding acquisition, M.M. and R.R. All authors have read and agreed to the published version of the manuscript. Funding: This research was funded by GREENER-H2020 (EU), grant number 826312 and MICINN/FEDER EU, grant number RTI2018-0933991-B-I00. D.G.-S. was granted by the MECD FPU fellowship program, grant number FPU14/03965. Acknowledgments: We thank Centro de Computación Científica (CCC) of the Universidad Autónoma de Madrid for giving us access to their computational resources. Conflicts of Interest: The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

References

1. Helmke, E.; Weyland, H. Rhodococcus marinonascens sp. nov., an actinomycete from the sea. Int. J. Syst. Evol. Microbiol. 1984, 34, 127–138. [CrossRef] 2. Margesin, R.; Labbe, D.; Schinner, F.; Greer, C.; Whyte, L. Characterization of hydrocarbon-degrading microbial populations in contaminated and pristine alpine soils. Appl. Environ. Microbiol. 2003, 69, 3085–3092. [CrossRef][PubMed] 3. Ryu, H.-W.; Joo, Y.-H.; An, Y.-J.; Cho, K.-S. Isolation and characterization of psychrotrophic and halotolerant Rhodococcus sp. YHLT-2. J. Microbiol. Biotechnol. 2006, 16, 605–612. 4. Adnani, N.; Braun, D.R.; McDonald, B.R.; Chevrette, M.G.; Currie, C.R.; Bugni, T.S. Complete genome sequence of Rhodococcus sp. strain WMMA185, a marine sponge-associated bacterium. Genome Announc. 2016, 4, e01406–e01416. [CrossRef] 5. Yassin, A. Rhodococcus triatomae sp. nov., isolated from a blood-sucking bug. Int. J. Syst. Evol. Microbiol. 2005, 55, 1575–1579. [CrossRef] Microorganisms 2020, 8, 774 13 of 16

6. Giguère, S.; Cohen, N.; Keith Chaffin, M.; Hines, S.; Hondalus, M.; Prescott, J.; Slovis, N. Rhodococcus equi: Clinical Manifestations, Virulence, and Immunity. J. Vet. Intern. Med. 2011, 25, 1221–1230. [CrossRef] 7. Prescott, J.F. Rhodococcus equi: An animal and human pathogen. Clin. Microbiol. Rev. 1991, 4, 20–34. [CrossRef] 8. Cornelis, K.; Ritsema, T.; Nijsse, J.; Holsters, M.; Goethals, K.; Jaziri, M. The plant pathogen Rhodococcus fascians colonizes the exterior and interior of the aerial parts of plants. Mol. Plant-Microbe Interact. 2001, 14, 599–608. [CrossRef] 9. Goethals, K.; Vereecke, D.; Jaziri, M.; Van Montagu, M.; Holsters, M. Leafy gall formation by Rhodococcus fascians. Annu. Rev. Phytopathol. 2001, 39, 27–52. [CrossRef] 10. De Carvalho, C.C.; Parreño-Marchante, B.; Neumann, G.; Da Fonseca, M.M.R.; Heipieper, H.J. Adaptation of Rhodococcus erythropolis DCL14 to growth on n-alkanes, alcohols and terpenes. Appl. Microbiol. Biotechnol. 2005, 67, 383–388. [CrossRef] 11. Iwasaki, T.; Takeda, H.; Miyauchi, K.; Yamada, T.; Masai, E.; Fukuda, M. Characterization of two biphenyl dioxygenases for biphenyl/PCB degradation in a PCB degrader, Rhodococcus sp. strain RHA1. Biosci. Biotechnol. Biochem. 2007, 71, 993–1002. [CrossRef] 12. Song, X.; Xu, Y.; Li, G.; Zhang, Y.; Huang, T.; Hu, Z. Isolation, characterization of Rhodococcus sp. P14 capable of degrading high-molecular-weight polycyclic aromatic hydrocarbons and aliphatic hydrocarbons. Mar. Pollut. Bull. 2011, 62, 2122–2128. [CrossRef][PubMed] 13. McLeod, M.P.; Warren, R.L.; Hsiao, W.W.; Araki, N.; Myhre, M.; Fernandes, C.; Miyazawa, D.; Wong, W.; Lillquist, A.L.; Wang, D. The complete genome of Rhodococcus sp. RHA1 provides insights into a catabolic powerhouse. Proc. Natl. Acad. Sci. USA 2006, 103, 15582–15587. [CrossRef][PubMed] 14. Lee, S.D.; Kim, Y.-J.; Kim, I.S. Rhodococcus subtropicus sp. nov., a new actinobacterium isolated from a cave. Int. J. Syst. Evol. Microbiol. 2019, 69, 3128–3134. [CrossRef] 15. Silva, L.J.; Souza, D.T.; Genuario, D.B.; Hoyos, H.A.V.; Santos, S.N.; Rosa, L.H.; Zucchi, T.D.; Melo, I.S. Rhodococcus psychrotolerans sp. nov., isolated from rhizosphere of Deschampsia antarctica. Antonie Van Leeuwenhoek 2018, 111, 629–636. [CrossRef][PubMed] 16. Wang, L.; Zhang, L.; Zhang, X.; Zhang, S.; Yang, L.; Yuan, H.; Chen, J.; Liang, C.; Huang, W.; Liu, J. Rhodococcus daqingensis sp. nov., isolated from petroleum-contaminated soil. Antonie van Leeuwenhoek 2019, 112, 695–702. [CrossRef] 17. Kämpfer, P.; Dott, W.; Martin, K.; Glaeser, S.P. Rhodococcus defluvii sp. nov., isolated from wastewater of a bioreactor and formal proposal to reclassify [Corynebacterium hoagii] and Rhodococcus equi as Rhodococcus hoagii comb. nov. Int. J. Syst. Evol. Microbiol. 2014, 64, 755–761. [CrossRef] 18. Tindall, B. A note on the genus name Rhodococcus Zopf 1891 and its homonyms. Int. J. Syst. Evol. Microbiol. 2014, 64, 1062–1064. [CrossRef] 19. Jones, A.; Sutcliffe, I.; Goodfellow, M. Proposal to replace the illegitimate genus name Prescottia Jones et al. 2013 with the genus name Prescottella gen. nov. and to replace the illegitimate combination Prescottia equi Jones et al. 2013 with Prescottella equi comb. nov. Antonie van Leeuwenhoek 2013, 103, 1405–1407. [CrossRef] 20. Sangal, V.; Goodfellow, M.; Jones, A.L.; Seviour, R.J.; Sutcliffe, I.C. Refined Systematics of the Genus Rhodococcus Based on Whole Genome Analyses. In Biology of Rhodococcus; Springer: Berlin/Heidelberg, Germany, 2019; pp. 1–21. 21. Tindall, B. The correct name of the taxon that contains the type strain of Rhodococcus equi. Int. J. Syst. Evol. Microbiol. 2014, 64, 302–308. [CrossRef] 22. Parte, A.C. LPSN—list of prokaryotic names with standing in nomenclature. Nucleic Acids Res. 2014, 42, D613–D616. [CrossRef][PubMed] 23. Orro, A.; Cappelletti, M.; D’Ursi, P.; Milanesi, L.; Di Canito, A.; Zampolli, J.; Collina, E.; Decorosi, F.; Viti, C.; Fedi, S. Genome and phenotype microarray analyses of Rhodococcus sp. BCP1 and Rhodococcus opacus R7: Genetic determinants and metabolic abilities with environmental relevance. PLOS ONE 2015, 10, e0139467. [CrossRef][PubMed] 24. Anastasi, E.; MacArthur, I.; Scortti, M.; Alvarez, S.; Giguère, S.; Vázquez-Boland, J.A. Pangenome and phylogenomic analysis of the pathogenic actinobacterium Rhodococcus equi. Genome Biol. Evol. 2016, 8, 3140–3148. [CrossRef][PubMed] 25. Creason, A.L.; Davis, E.W.; Putnam, M.L.; Vandeputte, O.M.; Chang, J.H. Use of whole genome sequences to develop a molecular phylogenetic framework for Rhodococcus fascians and the Rhodococcus genus. Front. Plant Sci. 2014, 5, 406. [CrossRef] Microorganisms 2020, 8, 774 14 of 16

26. Gürtler, V.; Mayall, B.C.; Seviour, R. Can whole genome analysis refine the taxonomy of the genus Rhodococcus? FEMS Microbiol. Rev. 2004, 28, 377–403. [CrossRef] 27. Richter, M.; Rosselló-Móra, R. Shifting the genomic gold standard for the prokaryotic species definition. Proc. Natl. Acad. Sci. USA 2009, 106, 19126–19131. [CrossRef] 28. Meier-Kolthoff, J.P.; Auch, A.F.; Klenk, H.-P.; Göker, M. Genome sequence-based species delimitation with confidence intervals and improved distance functions. BMC Bioinform. 2013, 14, 60. [CrossRef] 29. Garrido-Sanz, D.; Meier-Kolthoff, J.P.; Göker, M.; Martin, M.; Rivilla, R.; Redondo-Nieto, M. Genomic and genetic diversity within the Pseudomonas fluorescens complex. PLOS ONE 2016, 11, e0150183. [CrossRef] 30. Garrido-Sanz, D.; Redondo-Nieto, M.; Mongiardini, E.; Blanco-Romero, E.; Durán, D.; Quelas, J.I.; Martin, M.; Rivilla, R.; Lodeiro, A.R.; Althabegoiti, M.J. Phylogenomic analyses of Bradyrhizobium reveal uneven distribution of the lateral and subpolar flagellar systems, which extends to Rhizobiales. Microorganisms 2019, 7, 50. [CrossRef] 31. Garrido-Sanz, D.; Manzano, J.; Martín, M.; Redondo-Nieto, M.; Rivilla, R. Metagenomic analysis of a biphenyl-degrading soil bacterial consortium reveals the metabolic roles of specific populations. Front. Microbiol. 2018, 9, 232. [CrossRef] 32. Garrido-Sanz, D.; Sansegundo-Lobato, P.; Redondo-Nieto, M.; Suman, J.; Cajthaml, T.; Blanco-Romero, E.; Martin, M.; Uhlik, O.; Rivilla, R. Analysis of the biodegradative and adaptive potential of the novel polychlorinated biphenyl degrader Rhodococcus sp. WAY2 revealed by its complete genome sequence. Microb. Genom. 2020.[CrossRef][PubMed] 33. Kimura, N.; Kitagawa, W.; Mori, T.; Nakashima, N.; Tamura, T.; Kamagata, Y. Genetic and biochemical characterization of the dioxygenase involved in lateral dioxygenation of dibenzofuran from Rhodococcus opacus strain SAO101. Appl. Microbiol. Biotechnol. 2006, 73, 474–484. [CrossRef][PubMed] 34. NCBI ftp Server. Available online: ftp://ftp.ncbi.nlm.nih.gov (accessed on 1 June 2019). 35. Genome-to-genome Distance Calculator (GGDC) 2.1. Available online: http://ggdc.dsmz.de/ggdc.php (accessed on 1 July 2019). 36. Kumar, S.; Stecher, G.; Li, M.; Knyaz, C.; Tamura, K. MEGA X: Molecular evolutionary genetics analysis across computing platforms. Mol. Biol. Evol. 2018, 35, 1547–1549. [CrossRef][PubMed] 37. Göker, M.; García-Blázquez, G.; Voglmayr, H.; Tellería, M.T.; Martín, M.P. Molecular taxonomy of phytopathogenic fungi: A case study in Peronospora. PLoS ONE 2009, 4, e6319. [CrossRef] 38. Meier-Kolthoff, J.P.; Hahnke, R.L.; Petersen, J.; Scheuner, C.; Michael, V.; Fiebig, A.; Rohde, C.; Rohde, M.; Fartmann, B.; Goodwin, L.A. Complete genome sequence of DSM 30083 T, the type strain (U5/41 T) of Escherichia coli, and a proposal for delineating subspecies in microbial taxonomy. Stand. Genom. Sci. 2014, 9, 2. [CrossRef] 39. Hsieh, T.C.; Ma, K.H.; Chao, A. iNEXT: An R package for rarefaction and extrapolation of species diversity (Hill numbers). Methods Ecol. Evol. 2016, 7, 1451–1456. [CrossRef] 40. Emms, D.M.; Kelly, S. OrthoFinder: Solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol. 2015, 16, 157. [CrossRef] 41. Buchfink, B.; Xie, C.; Huson, D.H. Fast and sensitive protein alignment using DIAMOND. Nat. Methods 2015, 12, 59. [CrossRef] 42. Enright, A.J.; Van Dongen, S.; Ouzounis, C.A. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res. 2002, 30, 1575–1584. [CrossRef] 43. Wickham, H. ggplot2. Wiley Interdisciplinary Reviews. Comput. Stat. 2011, 3, 180–185. [CrossRef] 44. Kolde, R.; Kolde, M.R. Package ‘pheatmap’. R Package 2015, 1. Available online: https://cran.r-project.org/ web/packages/pheatmap/ (accessed on 1 November 2019). 45. Sievers, F.; Wilm, A.; Dineen, D.; Gibson, T.J.; Karplus, K.; Li, W.; Lopez, R.; McWilliam, H.; Remmert, M.; Söding, J. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol. Syst. Biol. 2011, 7.[CrossRef] 46. Castresana, J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol. Biol. Evol. 2000, 17, 540–552. [CrossRef] 47. Stamatakis, A. RAxML version 8: A tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 2014, 30, 1312–1313. [CrossRef] 48. Le, S.Q.; Gascuel, O. An improved general amino acid replacement matrix. Mol. Biol. Evol. 2008, 25, 1307–1320. [CrossRef][PubMed] Microorganisms 2020, 8, 774 15 of 16

49. Stamatakis, A.; Hoover, P.; Rougemont, J. A rapid bootstrap algorithm for the RAxML web servers. Syst. Biol. 2008, 57, 758–771. [CrossRef][PubMed] 50. Pattengale, N.D.; Alipour, M.; Bininda-Emonds, O.R.; Moret, B.M.; Stamatakis, A. How many bootstrap replicates are necessary? J. Comput. Biol. 2010, 17, 337–354. [CrossRef][PubMed] 51. Miller, M.A.; Pfeiffer, W.; Schwartz, T. Creating the CIPRES Science Gateway for inference of large phylogenetic trees. In Proceedings of the 2010 Gateway Computing Environments Workshop (GCE), New Orleans, LA, USA, 14 November 2010; pp. 1–8. 52. Li, J.; Zhou, J.; Wu, Y.; Yang, S.; Tian, D. GC-content of synonymous codons profoundly influences amino acid usage. G3: Genes, Genomes, Genet. 2015, 5, 2027–2036. [CrossRef][PubMed] 53. Zakham, F.; Aouane, O.; Ussery, D.; Benjouad, A.; Ennaji, M.M. Computational genomics-proteomics and Phylogeny analysis of twenty one mycobacterial genomes (Tuberculosis & non Tuberculosis strains). Microbial Inform. Exp. 2012, 2, 7. 54. Kim, J.-N.; Kim, Y.; Jeong, Y.; Roe, J.-H.; Kim, B.-G.; Cho, B.-K. Comparative genomics reveals the core and accessory genomes of Streptomyces species. J. Microbiol. Biotechnol. 2015, 25, 1599–1605. [CrossRef] 55. Patrauchan, M.A.; Florizone, C.; Eapen, S.; Gomez-Gil, L.; Sethuraman, B.; Fukuda, M.; Davies, J.; Mohn, W.W.; Eltis, L.D. Roles of ring-hydroxylating dioxygenases in styrene and benzene catabolism in Rhodococcus jostii RHA1. J. Bacteriol. 2008, 190, 37–47. [CrossRef][PubMed] 56. Resnick, S.; Lee, K.; Gibson, D. Diverse reactions catalyzed by naphthalene dioxygenase from Pseudomonas sp strain NCIB 9816. J. Ind. Microbiol. 1996, 17, 438–457. 57. Yen, K.-M.; Karl, M.R.; Blatt, L.M.; Simon, M.J.; Winter, R.B.; Fausset, P.R.; Lu, H.S.; Harcourt, A.A.; Chen, K.K. Cloning and characterization of a Pseudomonas mendocina KR1 gene cluster encoding toluene-4-monooxygenase. J. Bacteriol. 1991, 173, 5315–5327. [CrossRef][PubMed] 58. Ji, Y.; Mao, G.; Wang, Y.; Bartlam, M. Structural insights into diversity and n-alkane biodegradation mechanisms of alkane hydroxylases. Front. Microbiol. 2013, 4, 58. [CrossRef][PubMed] 59. Elliott, S.J.; Zhu, M.; Tso, L.; Nguyen, H.-H.T.; Yip, J.H.-K.; Chan, S.I. Regio-and stereoselectivity of particulate methane monooxygenase from Methylococcus capsulatus (Bath). J. Am. Chem. Soc. 1997, 119, 9949–9955. [CrossRef] 60. Smith, T.; Dalton, H. Biocatalysis by methane monooxygenase and its implications for the petroleum industry. In Studies in Surface Science and Catalysis; Elsevier: Amsterdam, The Netherlands, 2004; Volume 151, pp. 177–192. 61. Johnson, E.L.; Hyman, M.R. Propane and n-butane oxidation by Pseudomonas putida GPo1. Appl. Environ. Microbiol. 2006, 72, 950–952. [CrossRef] 62. Li, L.; Liu, X.; Yang, W.; Xu, F.; Wang, W.; Feng, L.; Bartlam, M.; Wang, L.; Rao, Z. Crystal structure of long-chain alkane monooxygenase (LadA) in complex with coenzyme FMN: Unveiling the long-chain alkane hydroxylase. J. Mol. Biol. 2008, 376, 453–465. [CrossRef] 63. van Beilen, J.B.; Wubbolts, M.G.; Witholt, B. Genetics of alkane oxidation by Pseudomonas oleovorans. Biodegradation 1994, 5, 161–174. [CrossRef] 64. Stainthorpe, A.; Lees, V.; Salmond, G.P.; Dalton, H.; Murrell, J.C. The methane monooxygenase gene cluster of Methylococcus capsulatus (Bath). Gene 1990, 91, 27–34. [CrossRef] 65. Chan, S.I.; Chen, K.H.-C.; Yu, S.S.-F.; Chen, C.-L.; Kuo, S.S.-J. Toward delineating the structure and function of the particulate methane monooxygenase from methanotrophic bacteria. Biochemistry 2004, 43, 4421–4430. [CrossRef] 66. Tavormina, P.L.; Ussler, W., III; Joye, S.B.; Harrison, B.K.; Orphan, V.J. Distributions of putative aerobic methanotrophs in diverse pelagic marine environments. ISME J. 2010, 4, 700. [CrossRef][PubMed] 67. Meynet, P.; Head, I.M.; Werner, D.; Davenport, R.J. Re-evaluation of dioxygenase gene phylogeny for the development and validation of a quantitative assay for environmental aromatic hydrocarbon degraders. FEMS Microbiol. Ecol. 2015, 91.[CrossRef][PubMed] 68. Seeger, M.; Pieper, D. Genetics of biphenyl biodegradation and co-metabolism of PCBs. In Handbook of Hydrocarbon and Lipid Microbiology; Timmis, K.N., Ed.; Springer: Berlin/Heidelberg, Germany, 2010; pp. 1179–1199. 69. Iwai, S.; Chai, B.; Sul, W.J.; Cole, J.R.; Hashsham, S.A.; Tiedje, J.M. Gene-targeted-metagenomics reveals extensive diversity of aromatic dioxygenase genes in the environment. ISME J. 2010, 4, 279–285. [CrossRef] [PubMed] Microorganisms 2020, 8, 774 16 of 16

70. Iwai, S.; Johnson, T.A.; Chai, B.; Hashsham, S.A.; Tiedje, J.M. Comparison of the specificities and efficacies of primers for aromatic dioxygenase gene analysis of environmental samples. Appl. Environ. Microbiol. 2011, 77, 3551–3557. [CrossRef][PubMed] 71. Grund, E.; Denecke, B.; Eichenlaub, R. Naphthalene degradation via salicylate and gentisate by Rhodococcus sp. strain B4. Appl. Environ. Microbiol. 1992, 58, 1874–1877. [CrossRef][PubMed] 72. Di Gennaro, P.; Terreni, P.; Masi, G.; Botti, S.; De Ferra, F.; Bestetti, G. Identification and characterization of genes involved in naphthalene degradation in Rhodococcus opacus R7. Appl. Microbiol. Biotechnol. 2010, 87, 297–308. [CrossRef][PubMed]

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).