
Screening hydrolase-producing environmental bacteria towards their application in bioremediation

Maria Cristina Lopes Matias

Thesis to obtain the Master of Science Degree in

Microbiology

Supervisor: Professor Rogério Paulo de Andrade Tenreiro
Co-supervisor: Professor Nuno Gonçalo Pereira Mira

Examination Committee
Chairperson: Professor Isabel Maria de Sá Correia Leite de Almeida
Supervisor: Professor Rogério Paulo de Andrade Tenreiro
Member of the Committee: Professor Ana Cristina Anjinho Madeira Viegas

October 2019

Declaration

I declare that this document is an original work of my own authorship and that it fulfills all the requirements of the Code of Conduct and Good Practices of the Universidade de Lisboa.

Preface

The work presented in this thesis was performed at Lab Bugworkers | M&B-BioISI, Faculty of Sciences of the University of Lisbon (Lisbon, Portugal), during the period from September 2017 to October 2019, under the supervision of Professor Rogério Tenreiro. The thesis was co-supervised at Instituto Superior Técnico (Lisbon, Portugal) by Professor Nuno Mira.

Acknowledgments

This thesis is the end result of the work developed within the Microbiology & Biotechnology Group of the Biosystems and Integrative Sciences Institute (M&B-BioISI), at Lab Bugworkers | M&B-BioISI, located in Tec Labs, the Innovation Centre of the Faculty of Sciences of the University of Lisbon. It would not have been successfully achieved without the help and collaboration of several people, to whom I would like to express my sincere gratitude.

First and foremost, I would like to thank my supervisor, Professor Rogério Tenreiro, for challenging me with this project. I have to thank him for his patience with my endless questions, for letting me err along the way (and learn from those errors), and for all the ideas to improve my work. I would also like to thank Professor Nuno Mira, my internal supervisor at Instituto Superior Técnico, for his advice.

This work was based on a collection of naturally occurring bacteria from man-made environments, which originated from a collaboration between the Lab Bugworkers | M&B-BioISI and the company BioTask, Biotechnology Solutions, Lda., so I would like to thank BioTask for the courtesy of allowing me to use the BBC collection in my work. In this context, special thanks go to Pedro Teixeira, who established the BBC collection in the course of his PhD, for his support, availability, and for all that I learned from him.

To Professor Ana Tenreiro, my thanks, for her reassuring presence, ideas, and for helping me with some of the most technical aspects of my work. To Filipa Silva, Chief Laboratory Officer of the Lab Bugworkers, for her availability, and for all that I learned from her; to her, and to the laboratory technician Francisca Moreira, my thanks for keeping the lab working efficiently. To Professor Lisete Fernandes, for the after-hours talks. To my colleagues in the Lab, with special regard to my fellow colleagues working on their master's theses, for their companionship, and for all their support.

I would also like to thank my family. Without them, I would not be here. Very special thanks go to my sister, for her patience, and for drawing my attention to the little details in my ongoing thesis, even though she does not have a biological sciences background.

Last, but not least, I would like to thank Alexandre Calapez, for all the encouragement and support when I decided to go back to school, and throughout all this period. I am also very grateful for all the long hours he spent reviewing my thesis, and for all the advice that helped to improve it.

Abstract

Hydrolases are an important type of enzymes, playing a valuable role in the turnover of nutrients by promoting the depolymerization of macromolecules. Hydrolases are very potent catalysts, are frequently secreted to the surrounding medium, and most do not require cofactors. These characteristics make them desirable for biotechnological applications, in particular for environmental protection, where hydrolase-producing organisms can be used to remove contaminants, in a process called bioremediation.

In this work, a set of 25 strains, selected from a collection of naturally occurring bacteria from man-made environments, was phenotypically characterized (laboratory routine tests, and the utilization of 23 substrates as sole carbon source), screened for 12 hydrolase activities (carboxylic ester hydrolases, deoxyribonucleases, eight glycosidases, peptidases, and ureases), and identified by 16S rRNA gene sequencing and phylogenetic analysis.

From this set, two subsets of strains, one able to degrade urea, and one able to degrade cellulose, were selected for further assays regarding their potential application in the biological treatment of wastewaters.

The most promising candidate was an Enterobacter sp. strain, isolated from a petrochemical wastewater treatment plant, found to be able to grow with urea as sole nitrogen source, and capable of urea and ammonium removal in synthetic wastewater. The strain is also able to grow in nutritious medium with up to 6.66% w/v of urea, as well as with up to 6.00% w/v of ammonium chloride. Such remarkable features pave the way for the future use of this strain as an asset for the treatment of urea-rich or ammonium-rich wastewaters.

Keywords: Hydrolase screening, bioremediation, wastewater treatment, urea, ammonium, cellulose.

Resumo

As hidrolases são um tipo de enzimas importante porque, ao promoverem a degradação de macromoléculas, desempenham um papel crucial no ciclo dos nutrientes. As hidrolases são catalisadores extremamente potentes, são geralmente excretadas para o exterior, e a maioria não requer cofatores, características que as tornam desejáveis para aplicações biotecnológicas, em particular no campo da proteção ambiental, onde organismos produtores de hidrolases podem ser utilizados na remoção de contaminantes. Este processo é designado por biorremediação.

Neste estudo, um grupo de 25 estirpes, selecionado a partir de uma coleção de bactérias de ocorrência natural em ambientes artificiais, foi alvo de caracterização fenotípica (testes de rotina laboratorial e utilização de 23 substratos diferentes como fonte única de carbono) e de prospeção de 12 atividades hidrolíticas (carboxil éster hidrolases, deoxirribonucleases, oito glicosidases, peptidases e ureases). As estirpes foram posteriormente identificadas a partir da sequenciação do gene 16S rRNA e de análise filogenética.

Deste grupo de 25 estirpes foram depois selecionados dois subgrupos para estudar a sua potencial aplicação no tratamento biológico de águas residuais: um com estirpes produtoras de ureases e um com estirpes capazes de hidrolisar celulose.

O candidato mais promissor encontrado foi uma estirpe do género Enterobacter, isolada de uma estação de tratamento de águas de uma instalação petroquímica. A estirpe é capaz de crescer com ureia como fonte única de azoto, e também capaz de remover ureia e amoníaco de água residual sintética. É ainda capaz de crescer em meio nutritivo suplementado com até 6.66% m/v de ureia, e em meio nutritivo com até 6.00% m/v de cloreto de amónio. Com estas características esta estirpe de Enterobacter pode vir a constituir um novo recurso no tratamento de águas residuais ricas em ureia ou amónia.

Palavras-chave Prospeção de hidrolases; biorremediação; tratamento de águas residuais; ureia, amónia; celulose.

Table of Contents

Declaration
Preface
Acknowledgments
Abstract
Resumo
Table of Contents
List of Figures
List of Tables
List of Acronyms & Abbreviations
1. Introduction
1.1. Environmental sustainability, green chemistry, and catalysis
1.2. Enzymes and biotechnological applications – an overview
1.3. Enzyme classification
1.3.1. International Union of Biochemistry and Molecular Biology (IUBMB) classification
1.3.2. Carboxylic ester hydrolases classification
1.3.3. Proteolytic enzymes classification
1.3.4. Carbohydrate-active enzymes classification
1.4. Hydrolases
1.4.1. Carboxylic ester hydrolases
1.4.2. Deoxyribonucleases
1.4.3. Glycosidases
1.4.3.1. Starch degrading enzymes
1.4.3.2. Plant cell wall degrading enzymes
1.4.3.3. Other glycosidases
1.4.4. Peptidases
1.4.5. Ureases
1.5. Biotechnological and industrial application of relevant hydrolases
1.6. Screening for novel enzymes
1.7. Enzymes of microbial origin
1.8. Bacteria identification for biotechnological applications
1.9. Bioremediation – microorganisms and enzymes
1.9.1. Green engineering, bioremediation, biostimulation, and bioaugmentation
1.9.2. Bacterial consortia
1.9.3. Enzymes in bioremediation
1.10. Wastewaters
1.11. The BioTask Bioremediation Culture collection
1.12. Thesis scope

2. Materials and Methods
2.1. Strain recovery, characterization, and selection
2.1.1. Initial set of strains
2.1.2. Strain recuperation, growth conditions, and routine maintenance
2.1.3. Phenotypic confirmation
2.2. Reduction of the number of strains, and selection of the working set
2.2.1. Experimental methodology
2.2.2. Data analysis
2.3. Characterization of the strains in the working set
2.3.1. Sole carbon source assay (metabolic characterization)
2.3.1.1. Experimental methodology
2.3.1.2. Data analysis
2.3.2. Biomass production assay
2.3.2.1. Experimental methodology
2.3.2.2. Data analysis
2.3.3. Influence of environmental factors in strain growth
2.3.3.1. Experimental methodology
2.3.3.2. Data analysis
2.4. Screening of hydrolytic enzymes
2.4.1. Carboxylesterase activity
2.4.2. DNase activity
2.4.3. activity
2.4.4. activity
2.4.5. activity
2.4.6. β-glucosidase activity
2.4.7. activity
2.4.8. Pectinase activity
2.4.9. activity
2.4.10. Peptidase activity
2.4.11. Urease activity
2.5. Strain identification
2.5.1. Total genomic DNA extraction
2.5.2. Partial amplification and sequencing of 16S rRNA gene
2.5.3. Sequence analysis and phylogenetic reconstruction
2.6. Case study – urea and ammonium removal for wastewater treatment
2.6.1. Preliminary test
2.6.2. Growth assay – urea versus ammonium chloride as sole nitrogen source
2.6.2.1. Experimental methodology

2.6.2.2. Data analysis
2.6.3. Respiration assay – urea versus ammonium chloride as sole nitrogen source
2.6.3.1. Experimental methodology
2.6.3.2. Data analysis
2.6.4. Urea and ammonium chloride tolerance assays
2.6.4.1. Experimental methodology
2.6.4.2. Data analysis
2.7. Case study – cellulose degradation for wastewater treatment
2.7.1. Experimental methodology
2.7.2. Data analysis
2.8. Workflow of the study
3. Results and Discussion
3.1. Rapid growth screening, and reduction of the number of strains
3.2. Phenotypic characterization of the strains in the working set
3.2.1. Physiological relationship between strains
3.2.2. Sole carbon source utilization
3.2.2.1. Sugars
3.2.2.2. Sugar alcohols
3.2.2.3. Esters
3.2.2.4. Integration of all the results in the sole carbon source assay
3.2.3. Biomass production
3.2.4. Influence of environmental factors
3.2.4.1. Temperature
3.2.4.2. Salinity
3.2.4.3. pH
3.2.5. Integration of the results in the phenotypic characterization
3.3. Enzyme screening
3.3.1. Carboxylic ester hydrolases
3.3.2. Deoxyribonucleases
3.3.3. Glycosidases
3.3.3.1. Agarases
3.3.3.2. Dextranases
3.3.3.3.
3.3.3.4. β-glucosidases
3.3.3.5.
3.3.3.6. Pectinases
3.3.3.7.
3.3.3.8. β-galactosidases

3.3.4. Peptidases
3.3.5. Ureases
3.3.6. Integration of all results in the hydrolase screening
3.4. Identification of the strains in the working set
3.4.1. Identification, comparison of phenotypic characteristics, and pathogenicity analysis
3.4.2. Comparison between closely phylogenetically related strains
3.5. Case study – Urea and ammonium removal for wastewater treatment
3.5.1. Preliminary test – growth in synthetic wastewaters
3.5.2. Growth in synthetic wastewater with urea or ammonium as sole nitrogen source
3.5.3. Respiration assay
3.5.4. Urea and ammonium tolerance
3.6. Case study – Cellulose degradation for wastewater treatment
4. General Conclusions and Future Perspectives
References
Appendices
Appendix A List of hydrolases
Appendix B Applications of relevant hydrolases
Appendix C BioTask Bioremediation Culture collection
Appendix D Rapid growth assay – Principal component analysis
Appendix E Phenotypic characterization of strains in the working set
Appendix F Utilization of sugars, sugar alcohols and esters
Appendix G Enzyme producers
Appendix H Phylogenetic trees of the strains in the working set

List of Figures

Figure 1. Workflow of the study.
Figure 2. Dendrogram displaying the relationship between strains in the initial set.
Figure 3. Spatial dispersion of the strains in the initial set by NAUCs for 36 h in TSB and NB.
Figure 4. Dendrogram displaying the relationship between strains in the working set.
Figure 5. Relative growth for BBC|161 and BBC|170 for all substrates in the sole carbon source assay.
Figure 6. Relative growth for all strains in the working set using , mannose, or cellobiose as sole carbon source.
Figure 7. Number of strains with strong positive growth by substrate as sole carbon source.
Figure 8. Scatterplots showing the growth relation between the isomer aldohexoses tested as sole carbon source.
Figure 9. Scatterplots showing the growth relation between the isomer pentoses tested as sole carbon source.
Figure 10. Scatterplots showing the growth relation between cellobiose and , and between the disaccharides and the monosaccharides that result from their .
Figure 11. Scatterplots showing the growth relation between glycerol and , glycerol and , and between mannitol and sorbitol when present as sole carbon source.
Figure 12. Scatterplots showing the growth relation between mannitol and mannose, mannitol and fructose, and sorbitol and glucose, when present as sole carbon source.
Figure 13. Scatterplots showing the growth relation between the Tweens tested as sole carbon source.
Figure 14. NAUC evolution through time in the biomass assay.
Figure 15. Effect of temperature in NAUC evolution through time.
Figure 16. Effect of salinity in NAUC evolution through time.
Figure 17. Effect of pH in NAUC evolution through time.
Figure 18. Number of strains with positive results by hydrolase activity in the working set.
Figure 19. Number of positive and strong positive hydrolase activities by strain.
Figure 20. Phylogenetic trees for strains BBC|024 and BBC|035.
Figure 21. Viable cell count (log10 cfu/ml) over time in synthetic wastewater with urea or ammonium chloride as sole nitrogen source.
Figure 22. Viable cell count (log10 cfu/ml) over time of strain BBC|003 in TSB supplemented with increasing concentration of urea or ammonium chloride.
Figure 23. Viable cell count (log10 cfu/ml) over time in base medium, and base medium with added 1% w/v of CMC or Avicel.
Supplementary Figure 1. Composite dendrograms for all BBC strains.
Supplementary Figure 2. Scatterplots showing the relation between NAUCs at 36 h and 24-48 h in TSB.
Supplementary Figure 3. Scatterplots showing the relation between NAUCs at 36 h and 24-48 h in NB.
Supplementary Figure 4. Spatial dispersion of the initial set in the PCA space defined by PC1 and PC2.
Supplementary Figure 5. Phylogenetic trees for the strains in the working set.

List of Tables

Table 1. Main Classes in the Enzyme Commission system.
Table 2. Categories in EC 3 hydrolases.
Table 3. Most relevant starch degrading glycosidases.
Table 4. Most relevant plant cell wall degrading glycosidases.
Table 5. Substrates used in the metabolic characterization.
Table 6. NAUCs for 36 h incubation in TSB and NB for clusters I, II, and IV.
Table 7. Phenotypic characterization – Sugars, sugar alcohols, and esters utilization by each strain.
Table 8. NAUCs for 120 h of incubation in the biomass assay.
Table 9. NAUCs for 72 h of incubation in the temperature assay.
Table 10. NAUCs for 120 h of incubation in the salinity assay.
Table 11. NAUCs for 120 h of incubation in the pH assay.
Table 12. Phenotypic characterization – Best growth conditions for each strain.
Table 13. Enzyme production in the working set.
Table 14. Homology matrix from 16S rRNA gene nucleotide-nucleotide comparison for strain BBC|024 and all validly published species in the genus Limnohabitans.
Table 15. Homology matrix from 16S rRNA gene nucleotide-nucleotide comparison for strain BBC|035 and phylogenetically related Bacillus species.
Table 16. Strain identification from partial 16S rRNA gene sequencing.
Table 17. Homology matrix from 16S rRNA gene nucleotide-nucleotide comparison for all strains closely related to Staphylococcus cohnii subsp. cohnii.
Table 18. Viable cell count (log10 cfu/ml) of strains BBC|003, BBC|016, BBC|020, and a consortium of the three strains, in synthetic wastewater with urea or ammonium chloride as sole nitrogen source.
Table 19. Mean AUC values for each 24 h interval in the respiration assay.
Table 20. Viable cell count (log10 cfu/ml) of strain BBC|003 in TSB with added urea.
Table 21. Viable cell count (log10 cfu/ml) of strain BBC|003 in TSB with added ammonium chloride.
Table 22. Viable cell count (log10 cfu/ml) of strains BBC|032, BBC|077, and a consortium of the two strains, in base medium, and base medium with added 1.00% w/v of CMC or Avicel.
Supplementary Table 1. EC number, accepted name, and systematic name for the hydrolases referred to in the present thesis.
Supplementary Table 2. Applications of some industrial and biotechnological relevant hydrolases.
Supplementary Table 3. Eigenvalues and respective percentage of variance for the first 13 PCs.
Supplementary Table 4. Explanatory variables, and their explanatory value, for PC1 and PC2.
Supplementary Table 5. NAUCs for 36 h of incubation in TSB and for all substrates in the sole carbon source assay.
Supplementary Table 6. Descriptive statistics for NAUCs for 36 h of incubation in TSB and for all substrates in the sole carbon source assay.
Supplementary Table 7. NAUCs for biomass production, and effect of environmental factors (temperature, salinity, and pH) in strain growth.
Supplementary Table 8. Descriptive statistics for NAUCs for biomass production, and effect of environmental factors (temperature, salinity, and pH) in strain growth.
Supplementary Table 9. Utilization of sugars as sole carbon source for each strain.
Supplementary Table 10. Utilization of sugar alcohols as sole carbon source for each strain.
Supplementary Table 11. Utilization of esters as sole carbon source for each strain.
Supplementary Table 12. Best carboxyl ester hydrolases producers in the working set.
Supplementary Table 13. Best β-glucosidase producers in the working set.
Supplementary Table 14. Best pectin degrading enzymes producers in the working set.
Supplementary Table 15. Best peptidase producers in the working set.

List of Acronyms & Abbreviations

ABPP  Activity-based protein profiling
ASRA  Adaptive Substituent Reordering Algorithm
AUC  Area under the curve
BBC  BioTask Bioremediation Culture
BHI  Brain-Heart Infusion Broth
BLASTn  Nucleotide-nucleotide Basic Local Alignment Search Tool
bp  Base pairs
CASTLE  Carboxylic ester hydrolase enzymes database
CAZy  Carbohydrate-active enzymes database
CAZyme  Carbohydrate-active enzyme
CEH  Carboxylic ester hydrolase
CFU  Colony forming unit
CMC  Carboxymethyl cellulose
CV  Coefficient of variation
DDH  DNA-DNA hybridization
DNase  Deoxyribonuclease
dNTP  Deoxyribonucleotide triphosphate
EC  Enzyme Commission
EDTA  Ethylenediaminetetraacetic acid
EPS  Extracellular polymeric substances
EtBr  Ethidium bromide
ExPASy  Expert Protein Analysis System
ExplorEnz  Enzyme database of the IUBMB Enzyme Nomenclature List
GRAS  Generally recognized as safe
HAI  Healthcare associated infection
IMG/M  Integrated Microbial Genomes and Microbiomes
IUBMB  International Union of Biochemistry and Molecular Biology
IUPAC  International Union of Pure and Applied Chemistry
LB  LB Broth (Miller)
MDR  Multidrug Resistance (efflux pumps)
MEROPS  Peptidase database
MPN  Most probable number
MSP-PCR  Microsatellite/minisatellite primed PCR
NAUC  Net area under the curve
NB  Nutrient Broth
NCBI  National Center for Biotechnology Information
NC-IUBMB  Nomenclature Committee of the International Union of Biochemistry and Molecular Biology
OD  Optical density
ODr  Relative optical density
OTU  Operational taxonomic unit
PC  Principal component
PCA  Principal component analysis
PDB  Protein Data Bank
pMGs  Phylogenetic marker genes
QPS  Qualified presumption of safety (status)
RAPD  Random amplification of polymorphic DNA
RCF  Relative centrifugal force
RDP  Ribosomal database project
RPM  Revolutions per minute
SAHN  Sequential agglomerative hierarchical non-overlapping (clustering method)
SDS  Sodium dodecyl sulfate
SIB  Swiss Institute of Bioinformatics
STDEV  Sample standard deviation
TBE  Tris-borate-EDTA
TE  Tris-EDTA
TMPD  N,N,N',N'-tetramethyl-p-phenylenediamine
Tris  Tris(hydroxymethyl)aminomethane
TSA  Trypto-casein Soy Agar
TSB  Trypto-casein Soy Broth
UPGMA  Unweighted pair group method with arithmetic average
UTI  Urinary tract infection
WWTP  Wastewater treatment plant
YNB  Yeast Nitrogen Base Without Amino Acids Broth
YNBA  Yeast Nitrogen Base Without Amino Acids Agar

1. Introduction

1. Environmental sustainability, green chemistry, and catalysis
2. Enzymes and biotechnological applications – an overview
3. Enzyme classification
4. Hydrolases
5. Biotechnological applications of relevant hydrolases
6. Screening for novel enzymes
7. Enzymes of microbial origin
8. Bacteria identification for biotechnological applications
9. Bioremediation – microorganisms and enzymes
10. Wastewaters
11. The BioTask Bioremediation Culture collection
12. Thesis scope

1.1. Environmental sustainability, green chemistry, and catalysis

Environmental sustainability aims at “meeting the resource and services needs of current and future generations without compromising the health of the ecosystems that provide them” [1].

A new paradigm, green chemistry, has emerged in response to the increasing global concern with environmental sustainability, resource depletion, and the growing chemical pollution problem. Green chemistry appeared about 30 years ago, between the late 1980s and the early 1990s [2]. In 1998, Anastas and Warner [3] defined green chemistry as “the invention, design, and application of chemical products and processes to reduce or to eliminate the use and generation of hazardous substances.” This definition was formally adopted by the Working Party on Synthetic Pathways and Processes in Green Chemistry of the International Union of Pure and Applied Chemistry (IUPAC) Commission on Physical Organic Chemistry in 2000 [4]. Also in 1998, Anastas and Warner [5] introduced the 12 principles of green chemistry: 1) prevention; 2) atom economy; 3) less hazardous chemical syntheses; 4) designing safer chemicals; 5) safer solvents and auxiliaries; 6) design for energy efficiency; 7) use of renewable feedstocks; 8) reduce derivatives; 9) catalysis; 10) design for degradation; 11) real-time analysis for pollution prevention; and 12) inherently safer chemistry for accident prevention [3]. The ninth principle states that the use of catalytic reagents should be encouraged, owing to their superior performance when compared to stoichiometric reagents. Catalysis is, therefore, one of the fundamental pillars of green chemistry, as the use of catalysts makes it possible to achieve simultaneously “the dual goals of environmental protection and economic benefit” [6].

1.2. Enzymes and biotechnological applications – an overview

Enzymes are macromolecules that act as biocatalysts [7,8], and are present in all forms of life [9], as they enable organisms to efficiently perform the chemical reactions crucial to life under physiological conditions [8]. Most of the known enzymes are proteins, and their activity is almost always dependent on their native conformation [8]. Temperature and pH are important for enzymatic activity, because high temperatures may cause proteins to denature, and pH may interfere with the ionization of groups in the active site of the enzymes, hindering their activity [9]. Enzymes can also be ribonucleic acid (RNA) based molecules with catalytic properties (ribozymes), which were discovered during the 1980s [10], or antibodies with catalytic activity (abzymes), first obtained in 1986 [11].

Enzymes are extremely potent catalysts. One such example is carbonic anhydrase, which catalyzes the interconversion of carbon dioxide and water to bicarbonate and protons, and vice versa, with a turnover of 600 000 s⁻¹ [9]. Like all catalysts, enzymes do not shift the equilibrium of a chemical reaction; they merely increase the rate of a reaction by lowering its activation energy [12], and are not consumed in the process [9]. Some enzymes may require cofactors (ions and coenzymes) to become active [8], and their activity may be influenced by the presence of inhibitors [9]. Also, enzymes show high specificity (generally acting on only one type of substrate, or on a specific functional group) [9], regiospecificity (acting only on a specific regioisomer), and stereospecificity (acting only on one enantiomer in a racemic mixture) [7].
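The link between a lower activation energy and a higher reaction rate can be illustrated with the Arrhenius equation, k = A·exp(−Ea/(R·T)). The sketch below is only an illustration under stated assumptions: the pre-exponential factor A is taken to be the same for the catalyzed and the uncatalyzed reaction, and the 30 kJ/mol reduction and the function name `rate_enhancement` are hypothetical choices, not values or methods from this thesis.

```python
import math

R = 8.314  # molar gas constant, J/(mol*K)


def rate_enhancement(delta_ea_kj_per_mol: float, temperature_k: float = 298.15) -> float:
    """Return k_catalyzed / k_uncatalyzed for a given drop in activation energy.

    Assumes the Arrhenius equation k = A * exp(-Ea / (R * T)) and an identical
    pre-exponential factor A for both reactions (a simplifying assumption).
    """
    delta_ea_j_per_mol = delta_ea_kj_per_mol * 1000.0
    return math.exp(delta_ea_j_per_mol / (R * temperature_k))


# Hypothetical example: activation energy lowered by 30 kJ/mol at 25 °C.
print(f"Rate enhancement: {rate_enhancement(30.0):.2e}")
```

Because the same factor applies to the forward and to the reverse reaction, the ratio of the two rate constants, and therefore the equilibrium position, is unchanged.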

Enzymes are environmentally friendly, because they are biodegradable and generate less waste than more traditional production methods. Enzyme utilization allows manufacturing processes to occur at more moderate temperatures (and lower costs), or at less extreme pH values. The use of enzymes also reduces the need for organic solvents and other potentially harmful reagents [13].

Enzymes are used in large quantities in agriculture, food, and animal feed production, textile and paper manufacturing, as components of detergents, for biofuel production, in pharmaceuticals, in fine chemicals production, and in the cosmetic industry [12,13]. Enzymes are also used for research purposes, as therapeutic agents, and in analytical assays and devices [9]. Enzymes of microbial origin have been reported for potential application in bioremediation due to their ability to reduce the toxicity of contaminants [14,15]. In terms of diversity, most commercial enzymes are for research purposes. The bulk of production, however, consists of enzymes for industrial applications. The leading enzymes in the market are hydrolases, namely proteases, amylases, cellulases, and lipases [9,16]. This dominance is due to their stability, efficiency, and substrate specificity [17], and to the fact that hydrolases are usually extracellular enzymes (exoenzymes). For this reason, hydrolase extraction can generally be accomplished by filtration or centrifugation, without requiring cell lysis [18].

The demand for enzymes is on the rise, due to the increasing need for ecological and more sustainable solutions [19]. The global market for enzymes for industrial applications was near $5 billion in 2016, and is expected to exceed $6 billion in 2021, with a compound annual growth rate of 4.7% for the five-year period 2016-2021 [20].
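The market figures quoted above can be checked against each other with the compound growth formula, end = start × (1 + rate)^years. The following is a minimal sketch; the function names are hypothetical, and the starting value of 5.0 is only the rounded "near $5 billion" figure from the text, so the result is an approximation.

```python
def project_value(start_value: float, annual_rate: float, years: int) -> float:
    """Project a value forward under a constant compound annual growth rate."""
    return start_value * (1.0 + annual_rate) ** years


def compound_annual_growth_rate(start_value: float, end_value: float, years: int) -> float:
    """Return the constant annual rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Rounded figures from the text: about $5 billion in 2016, 4.7% per year, 5 years.
projected_2021 = project_value(5.0, 0.047, 5)
print(f"Projected 2021 market: ${projected_2021:.2f} billion")
```

Under these rounded inputs the projection lands above $6 billion, which agrees with the forecast cited for 2021.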

1.3. Enzyme classification

1.3.1. International Union of Biochemistry and Molecular Biology (IUBMB) classification

Based on the type of reaction that they catalyze, enzymes are divided into seven classes (see Table 1). Each class of enzymes is given an Enzyme Commission number (EC number), and a name, as recommended by the Nomenclature Committee of the IUBMB (NC-IUBMB) [21,22]. Each class is then further subdivided into three successive levels. Therefore, an enzyme always has a four-part EC number. The primary repository for all enzymes named and classified by the IUBMB is the database ExplorEnz, hosted at http://www.enzyme-database.org/ [22,23].

Table 1. Main Classes in the Enzyme Commission system.

No.*   Class            Reaction
EC 1   Oxidoreductases  Catalyze oxidation-reduction reactions (transfer of electrons between molecules).
EC 2   Transferases     Catalyze group transfer reactions.
EC 3   Hydrolases       Catalyze hydrolysis of chemical bonds.
EC 4   Lyases           Catalyze addition of groups to double bonds or the inverse reaction (formation of double bonds by group removal).
EC 5   Isomerases       Catalyze isomerization (structural change of the substrate).
EC 6   Ligases          Catalyze ligation of substrates coupled to ATP cleavage.
EC 7   Translocases     Catalyze the movement of ions or molecules across membranes.

* Data retrieved from [8,9,12] and curated with information from the IUBMB Biochemical Nomenclature web portal, hosted at https://www.qmul.ac.uk/sbcs/iubmb/enzyme/.
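As an illustration of how the first field of an EC number maps to the classes in Table 1, the sketch below parses an EC number and returns the class name. It is a hypothetical helper, not part of the work described in this thesis: the name `ec_class` and the regular expression are assumptions, and the pattern accepts only numeric fields or a dash for unspecified fields (as in EC 3.1.1.-).

```python
import re

# Class names taken from Table 1 (first field of the EC number).
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}

# Four dot-separated fields, optionally prefixed by "EC"; "-" marks an unspecified field.
_EC_PATTERN = re.compile(r"^(?:EC\s*)?(\d+)\.(\d+|-)\.(\d+|-)\.(\d+|-)$")


def ec_class(ec_number: str) -> str:
    """Return the main enzyme class for an EC number such as 'EC 3.1.1.3'."""
    match = _EC_PATTERN.match(ec_number.strip())
    if match is None:
        raise ValueError(f"Not a four-part EC number: {ec_number!r}")
    first_field = int(match.group(1))
    if first_field not in EC_CLASSES:
        raise ValueError(f"Unknown EC class: {first_field}")
    return EC_CLASSES[first_field]


print(ec_class("EC 3.1.1.3"))  # Hydrolases
```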

1.3.2. Carboxylic ester hydrolases classification

In 2016, Chen, Black, and Reilly [24] presented a classification for carboxylic ester hydrolases (CEHs), based on their primary, secondary, and tertiary structures. CEHs were grouped into 91 families, based on their amino acid sequence. Families with closely superimposing secondary and tertiary structures were grouped into five clans (from A to E), even if their primary structure did not reveal high levels of sequence similarity. The CEHs classification can be consulted at CASTLE: The database of CArboxylic eSTer hydroLasE enzymes, hosted at https://castle.cbe.iastate.edu/.

1.3.3. Proteolytic enzymes classification

In 1993, Rawlings and Barrett [25] proposed a hierarchical classification for peptidases. In this classification, peptidases and protein inhibitors are grouped together into protein-species. Protein-species are assembled into evolutionary families and, in some cases, families are clustered into clans, which reflect more distant relationships [25]. Peptidase families comprise proteins with related amino acid sequences in the part of the protein responsible for the catalytic or the inhibitory activity, and clans correspond to clusters of families with related tertiary structures [26]. Eleven years later, in 2004, Rawlings, Tole, and Barrett [27] proposed a classification for proteins that act as peptidase inhibitors, also based on amino acid sequence. The classification of peptidases and peptidase inhibitors can be consulted at MEROPS, the peptidase database, hosted at https://www.ebi.ac.uk/merops/ [26,28,29].

1.3.4. Carbohydrate-active enzymes classification Complex carbohydrates are made of backbones of linked monosaccharide residues, with substituents, appearing in numerous combinations in nature. Carbohydrate high diversity is accompanied by an also high diversity of enzymes involved in their synthesis, , and modification [30]. In 1991, Henrissat [31] proposed a classification for glycosyl hydrolases, based on their primary structure. Several following works resulted in an enzyme classification in use for more than two decades, where families of enzymes are defined by amino acid sequence similarity [30]. Carbohydrate-active enzymes (CAZymes) are divided into five families of enzymes that degrade, modify, or create glycosidic bonds: GH#, for glycoside hydrolases; GT#, for ; PL#, for polysaccharide lyases; CE#, for carbohydrate esterases, and CBM#, for carbohydrate-binding modules. These families are subsequently subdivided into subfamilies and clans1.

1 Definitions and Terminology in CAZy, http://www.cazy.org/Definitions-and-Terminology.html.

The classification, structure, and molecular mechanism of each CAZyme can be consulted at the Carbohydrate-Active enZYmes database (CAZy), hosted at http://www.cazy.org/ [31].

1.4. Hydrolases

Hydrolases (class 3 by the NC-IUBMB) are important enzymes for the depolymerization of proteins, carbohydrates, and nucleic acids [8]. These enzymes catalyze the hydrolysis of chemical bonds, using water as a nucleophile (electron pair donor) [17], following the reaction [21,22]:

A-B + H2O ⇌ A-OH + B-H (1)

Hydrolases are subdivided into 13 categories, from EC 3.1 to EC 3.13, depending on the type of chemical bond they act upon (see Table 2).

Table 2. Categories in EC 3 hydrolases.

No.* | Class
EC 3.1 | Acting on ester bonds
EC 3.2 | Glycosylases
EC 3.3 | Acting on ether bonds
EC 3.4 | Peptidases
EC 3.5 | Acting on carbon-nitrogen bonds, other than peptide bonds
EC 3.6 | Acting on acid anhydrides
EC 3.7 | Acting on carbon-carbon bonds
EC 3.8 | Acting on halide bonds
EC 3.9 | Acting on phosphorus-nitrogen bonds
EC 3.10 | Acting on sulfur-nitrogen bonds
EC 3.11 | Acting on carbon-phosphorus bonds
EC 3.12 | Acting on sulfur-sulfur bonds
EC 3.13 | Acting on carbon-sulfur bonds

* Data retrieved from the IUBMB Biochemical Nomenclature portal, hosted at https://www.qmul.ac.uk/sbcs/iubmb/enzyme/EC3/.
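The relation between an EC 3 number and the bond type in Table 2 can be illustrated with a simple lookup on the second digit of the EC number. The following is only a minimal sketch and not part of the original work: the names EC3_SUBCLASSES and ec3_subclass are hypothetical, and the name used for EC 3.2 (glycosylases) is assumed from the IUBMB nomenclature.

```python
# Categories copied from Table 2. The EC 3.2 name ("Glycosylases") is an
# assumption taken from the IUBMB nomenclature.
EC3_SUBCLASSES = {
    1: "Acting on ester bonds",
    2: "Glycosylases",
    3: "Acting on ether bonds",
    4: "Peptidases",
    5: "Acting on carbon-nitrogen bonds, other than peptide bonds",
    6: "Acting on acid anhydrides",
    7: "Acting on carbon-carbon bonds",
    8: "Acting on halide bonds",
    9: "Acting on phosphorus-nitrogen bonds",
    10: "Acting on sulfur-nitrogen bonds",
    11: "Acting on carbon-phosphorus bonds",
    12: "Acting on sulfur-sulfur bonds",
    13: "Acting on carbon-sulfur bonds",
}


def ec3_subclass(ec_number: str) -> str:
    """Return the Table 2 category for an EC 3 number such as 'EC 3.4.21.62'.

    Hypothetical helper: only the first two fields of the EC number are read.
    """
    parts = ec_number.replace("EC", "").strip().split(".")
    if len(parts) < 2 or parts[0] != "3":
        raise ValueError(f"not an EC 3 (hydrolase) number: {ec_number!r}")
    try:
        return EC3_SUBCLASSES[int(parts[1])]
    except (KeyError, ValueError):
        raise ValueError(f"unknown EC 3 subclass in {ec_number!r}") from None


if __name__ == "__main__":
    for ec in ("EC 3.1.1.3", "EC 3.2.1.1", "EC 3.4.11.-", "EC 3.5.1.5"):
        print(ec, "->", ec3_subclass(ec))
```

Partial numbers such as EC 3.4.11.- are accepted, because only the class and subclass fields are used.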

As of October 15, 2019, 26 675 macromolecular structures, representing 35.24% of all enzyme structures (75 702) in the Protein Data Bank (PDB), were from hydrolases² [32].

See Supplementary Table 1 in Appendix A for EC number and names of the hydrolases mentioned in this thesis.

1.4.1. Carboxylic ester hydrolases

CEHs, or carboxylesterases (EC 3.1.1.-), catalyze the hydrolysis of ester bonds, resulting in a molecule of alcohol, and one or more carboxylic acids [33,34], following the reaction:

Carboxylic ester + H2O ⇌ Alcohol + Carboxylic acid (2)

The two most relevant CEHs for industrial and biotechnological applications are non-specific esterases (EC 3.1.1.1), and “true” lipases (EC 3.1.1.3) [33]. Non-specific esterases and lipases can be differentiated from each other by substrate specificity, and by their kinetic behavior [33]. Both esterases and lipases have the potential to catalyze the synthesis or the hydrolysis of esters [35–37]. Non-specific esterases are active on water-soluble short acyl chain esters, in very low concentration solutions, but not on emulsions of water-insoluble esters. On the other hand, lipases have a broader substrate range, being active in emulsions of insoluble esters, and in solutions of short acyl chain esters [33,38]. Lipases may also show interfacial activation (an increase of activity when the substrate starts to form an emulsion) [35], associated with a mobile structure at the active site of some lipases, called the lid [39,40]. Lipases with interfacial activation do not obey classical Michaelis-Menten kinetics (only valid for soluble enzymes and substrates), whereas non-specific esterases do [35,41].

2 PDB Data Distribution by Enzyme Classification, https://www.rcsb.org/stats/enzyme.
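For reference, the classical Michaelis-Menten behavior mentioned above, which non-specific esterases follow, can be sketched numerically. This is only an illustration under stated assumptions: the symbols v_max and k_m, the function name, and all numerical values are hypothetical and were not taken from the cited works. The sketch does not describe lipases with interfacial activation, which deviate from this model.

```python
def michaelis_menten_rate(substrate_conc: float, v_max: float, k_m: float) -> float:
    """Initial rate v = v_max * [S] / (k_m + [S]) for a soluble enzyme and substrate."""
    if substrate_conc < 0 or v_max < 0 or k_m <= 0:
        raise ValueError("concentrations and v_max must be non-negative, k_m positive")
    return v_max * substrate_conc / (k_m + substrate_conc)


if __name__ == "__main__":
    # Arbitrary illustrative parameters (assumed units: mM for [S] and k_m,
    # µmol/min for v_max).
    v_max, k_m = 10.0, 2.0
    for s in (0.0, 0.5, 2.0, 8.0, 50.0):
        print(f"[S] = {s:5.1f} mM -> v = {michaelis_menten_rate(s, v_max, k_m):.2f}")
```

The rate rises hyperbolically with substrate concentration and equals half of v_max when the substrate concentration equals k_m. An enzyme showing interfacial activation would instead display a sharp increase in activity once the substrate exceeds its solubility limit.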

1.4.2. Deoxyribonucleases

Deoxyribonucleases (DNases) catalyze the hydrolysis of phosphodiester bonds in deoxyribonucleic acid (DNA). DNases can be divided into endonucleases (EC 3.1.21.-), if the catalyzed reaction occurs within the molecule, thus producing smaller polynucleotide chains; or exonucleases, if the cleavage occurs at the extremities of the DNA molecule, releasing deoxyribonucleotide residues [8,12,42]. Exonucleases acting in the 3’→5’ direction are numbered EC 3.1.11.-³, and exonucleases acting in the 5’→3’ direction are numbered EC 3.1.12.-⁴.

1.4.3. Glycosidases

Glycosidases (EC 3.2.1.-) catalyze the hydrolysis of glycosidic bonds in highly diverse biopolymers, namely carbohydrates and glycoconjugates [43,44]. Carbohydrates are molecules composed mainly of carbon, hydrogen, and oxygen [7], that can appear as monosaccharides, disaccharides, oligosaccharides or polysaccharides. Polysaccharides composed of a single kind of monosaccharide are called homopolysaccharides or homoglycans (e.g., cellulose, composed only of linked glucose residues); polysaccharides with two or more types of monosaccharides are called heteropolysaccharides or heteroglycans (e.g., hyaluronic acid, composed of alternating units of N-acetyl glucosamine and glucuronic acid) [7].

Glucans, a special type of glycans, are linear or branched homopolymers of glucose. Glucans can function as structural or as reserve carbohydrates. These polysaccharides are divided into α-, β-, and α,β-glucans, depending on the glycosidic bonds between the glucose residues. Starch and cellulose are the most relevant glucans among higher plants [45].

1.4.3.1. Starch degrading enzymes

Starch is one of the most important reserve molecules in plants. It is a homopolysaccharide with 50 to several thousand glucose residues [46,47]. A molecule of starch has two components: amylose and amylopectin. Amylose is a linear polymer with α-1,4 glycosidic bonds between glucose residues. Amylopectin is a branched polymer with α-1,4 and α-1,6 bonds, similar to glycogen, a storage polysaccharide in animals, fungi and bacteria. Amylopectin is the major component of starch, corresponding to more than 70% of the starch molecule [8,46,47]. Complete hydrolysis of starch may require up to seven distinct enzymes, described in Table 3.

1.4.3.2. Plant cell wall degrading enzymes

The dry matter of plant cell walls is primarily composed of polysaccharides, like cellulose, hemicelluloses, and pectins [48], and of lignins [49,50].

3 ENZYME class: 3.1.11, https://enzyme.expasy.org/EC/3.1.11.-.
4 ENZYME entry: EC 3.1.12.1, https://enzyme.expasy.org/EC/3.1.12.-.

Table 3. Most relevant starch degrading glycosidases.

Group | Enzyme* | Class | Reaction | Product
Endoamylases | α-amylase | EC 3.2.1.1 | Cleavage of internal α-1,4 bonds in polysaccharides containing at least three α-1,4 linked glucose units | Dextrins#
Exoamylases | β-amylase | EC 3.2.1.2 | Cleavage of α-1,4 bonds from the non-reducing terminal | Maltose†
Exoamylases | Glucan 1,4-α-glucosidase | EC 3.2.1.3 | Cleavage of α-1,4 bonds on the non-reducing terminal | Glucose
Exoamylases | α-glucosidase | EC 3.2.1.20 | Cleavage of α-1,4 bonds on the non-reducing terminal | Glucose
Debranching enzymes | Pullulanase | EC 3.2.1.41 | Cleavage of α-1,6 bonds in pullulanҰ, amylopectin and in the α- and β-limit dextrins of amylopectin | Maltotriose‡
Debranching enzymes | Isoamylase | EC 3.2.1.68 | Cleavage of α-1,6 branch bonds in amylopectin and their β-limit dextrins | Malto-oligosaccharides¥
Debranching enzymes | Limit dextrinase | EC 3.2.1.142 | Cleavage of α-1,6 bonds in amylopectin, pullulan, and α- and β-limit dextrins of amylopectin | Maltose

* Data retrieved from [19,37,51] and curated with information from the Enzyme nomenclature database from the SIB ExPASy Bioinformatics Resources Portal, hosted at https://enzyme.expasy.org/ [52].
# Dextrins⁵ are low-molecular-weight carbohydrates of glucose with α-1,4 and α-1,6 bonds, derived from the hydrolysis of starch, used as binding and thickening agents in food processing and pharmaceuticals.
† Maltose⁶ is a disaccharide of glucose linked by an α-1,4 bond, used for sweetening, present as an intermediate sugar in brewing.
‡ Maltotriose is a trisaccharide of α-1,4 linked glucose residues.
Ұ Pullulan is a linear carbohydrate of maltotriose units linked by α-1,6 bonds [47].
¥ Malto-oligosaccharides are low-weight polymers of glucose units linked as in maltose.

The cell wall of plants has two layers: a primary cell wall, adjacent to the plasma membrane, mainly composed of cellulose, hemicellulose, and pectins, associated in a matrix; and a middle lamella, rich in pectin, exterior to the primary wall, located in the space between adjacent cells. In addition to these, other polymers, like lignin, which is deposited for strengthening and waterproofing purposes, may be present in the cell wall. Some cells may develop a secondary cell wall (a third layer), adjacent to the plasma membrane, composed mostly of cellulose and hemicellulose [53].

Cellulose comprises approximately 25-30% of the dry weight of the cell wall, making it the most abundant polysaccharide on Earth [48,49]. Cellulose is a water-insoluble linear homopolymer with at least 500 glucose residues linked by β-1,4 bonds. It is relatively resistant to degradation, due to its crystalline structure of microfibrils, kept together by hydrogen bonds [48,49]. Complete hydrolysis of cellulose produces glucose and requires at least three types of glycosidases: endoglucanases, cellobiosidases, and β-glucosidases [49,54–57].

Hemicelluloses comprise branched polysaccharides, with a linear homopolymer backbone of β-1,4 linked units of one type of sugar, and short side chains of other sugar residues. These molecules represent 15-20% of the cell wall dry weight [48]. In addition to their structural role, hemicelluloses can also function as reserve carbohydrates (e.g., mannan in seeds) [58]. Depolymerization of hemicelluloses yields, beyond glucose, other monosaccharides, like xylose, mannose, galactose, and arabinose [57,59]. Hydrolysis of hemicelluloses requires several glycosidases, involved in the hydrolysis of the glycosidic bonds in the molecule, and esterases, to remove esterified side groups (mainly acetate and ferulic side groups) [60]. Hemicelluloses include xyloglucans, xylans, mannans, and glucomannans, present in all terrestrial plants, and β-(1,3;1,4)-glucans (mixed-linkage glucans) present in Poales (order of monocotyledonous flowering plants that includes the grasses) [58]. Xylans are branched polysaccharides with a backbone of xylose residues [59]. Complete xylan hydrolysis requires debranching enzymes, like α-glucuronidases, esterases, and α-arabinosidases, responsible for removing the side chains, as well as xylanases and β-xylosidases [61].

5 NCBI. PubChem Database. Dextrin, CID=62698, https://pubchem.ncbi.nlm.nih.gov/compound/dextrin.
6 NCBI. PubChem Database. Maltose, CID=439341, https://pubchem.ncbi.nlm.nih.gov/compound/maltose.

Pectins are a heterogeneous group of polysaccharides rich in galacturonic acid residues linked by α-1,4 bonds, with esterified acetyl or methyl groups [48,49]. Pectins are a major component of the middle lamella, representing up to 35% of the cell wall dry weight [48,49,62]. These polysaccharides are mainly divided into homogalacturonans, and type I and type II rhamnogalacturonans (RG I and RG II) [49,63]. Complete pectin hydrolysis requires, apart from several glycosidases, the presence of pectin esterases, and pectin lyases [64–67]. Pectin esterases (EC 3.1.1.-) remove the esterified groups from pectin: pectinesterases (EC 3.1.1.11) catalyze the removal of methyl groups, releasing methanol and pectic acid, whereas pectin acetylesterases remove acetyl groups, releasing acetate [63]. Pectin and pectate lyases catalyze the transelimination of glycosidic linkages in, respectively, high and low esterified pectin [63].

Lignins are amorphous cross-linked aromatic heteropolymers. One of the main roles of lignins is to bind cellulose to other cell wall components in vascular plants and red algae [50,57,68]. Due to its complex structure, lignin, which can represent a renewable source of aromatic compounds, poses a biotechnological challenge in its enzymatic degradation, requiring several oxidases, peroxidases, laccases, and accessory enzymes [69].

See Table 4 for information on the most relevant glycosidases required for plant cell wall degradation.

1.4.3.3. Other glycosidases

Agar is a reserve polysaccharide of alternating D- and L-galactose, linked by α-1,3 and β-1,4 bonds [70,71], associated with cellulose in agarophytes [70]. The agar molecule has two components: agarose and agaropectin. Agarose is a linear polymer, responsible for the agar gelling power, comprising around 70% of the agar molecule. Agaropectin, representing the remaining 30%, is a sulphated branched polymer [70,72,73]. Agarose is hydrolyzed by agarases: α-agarases (EC 3.2.1.158) cleave internal α-1,3 bonds, and β-agarases (EC 3.2.1.81) cleave the β-1,4 bonds [74,75].

Lactose is the most important sugar in milk. It is a disaccharide of glucose and galactose linked by a β-1,4 glycosidic bond [12,76]. Lactose is hydrolyzed by the enzyme lactase (EC 3.2.1.108)⁷ from the intestinal mucosa, following the reaction:

Lactose + H2O ⇌ D-galactose + D-glucose (3)

7 Enzyme entry: EC 3.2.1.108, https://enzyme.expasy.org/EC/3.2.1.108.

Table 4. Most relevant plant cell wall degrading glycosidases.

Group | Enzyme* | Class | Reaction | Product
Cellulose degrading enzymes | Cellulase (endoglucanase) | EC 3.2.1.4 | Cleavage of internal β-1,4 bonds in cellulose, lichenin#, and cereal β-D-glucans | Cellulose oligosaccharides
Cellulose degrading enzymes | β-glucosidase (exoglucanase) | EC 3.2.1.21 | Cleavage of β-1,4 bonds from the non-reducing terminal | Glucose
Cellulose degrading enzymes | Cellulose 1,4-β-cellobiosidase (exoglucanase) | EC 3.2.1.91 | Cleavage of β-1,4 bonds from the non-reducing terminal | Cellobiose†
Xylan degrading enzymes | Endo-1,4-β-xylanase (endoxylanase) | EC 3.2.1.8 | Cleavage of internal β-1,4 bonds in xylans | Xylan oligosaccharides
Xylan degrading enzymes | Xylan 1,4-β-xylosidase | EC 3.2.1.37 | Cleavage of β-1,4 bonds from the non-reducing terminal | Xylose
Xylan degrading enzymes | Non-reducing end α-L-arabinofuranosidase | EC 3.2.1.55 | Cleavage of α-L-arabinofuranoside residues from the non-reducing terminal in α-L-arabinosides | Arabinofuranoside
Xylan degrading enzymes | α-glucuronidase | EC 3.2.1.139 | Cleavage of D-glucuronate from α-D-glucuronosides | Glucuronate
Pectin degrading enzymes | Polygalacturonase (endopectinase) | EC 3.2.1.15 | Cleavage of internal α-1,4 bonds from pectate and other galacturonans | Pectin oligosaccharides
Pectin degrading enzymes | Galacturan 1,4-α-galacturonidase (exopectinase) | EC 3.2.1.67 | Cleavage of α-1,4 bonds from the non-reducing terminal | Monogalacturonate
Pectin degrading enzymes | Exo-poly-α-galacturonosidase | EC 3.2.1.82 | Cleavage of α-1,4 bonds from the non-reducing terminal | Digalacturonate
Pectin degrading enzymes | α-L-rhamnosidase | EC 3.2.1.40 | Cleavage of α-L-rhamnose residues from the non-reducing terminal in α-L-rhamnosides | Rhamnose

* Data retrieved from [19,37,49,61,63,66,77] and curated with information from the Enzyme nomenclature database from the SIB ExPASy Bioinformatics Resources Portal.
# Lichenin is a glucan with β-(1,3;1,4) bonds, present in some types of lichens (e.g., in Cetraria islandica, Parmotrema spp., and Rimelia spp.). Lichenin is degraded by licheninases (EC 3.2.1.73), enzymes active on β-(1,3;1,4) bonds [78], and by glycosidases active on β-1,4 bonds, like cellulases, and glycosidases active on β-1,3 bonds, like endo-1,3(4)-β-glucanases (EC 3.2.1.6)⁸.
† Cellobiose is a disaccharide of glucose with β-1,4 bonds.

Lactose can also be cleaved by the action of microbial β-galactosidases (EC 3.2.1.23)⁹, which are enzymes that catalyze the release of galactose residues from β-D-galactosides. β-galactosidases may also function as transgalactosylases, producing galacto-oligosaccharides, a type of prebiotic [79,80].

Dextrans are glucans with α-1,6 links between glucose residues, and branched chains from α-1,4 links (α-1,2 and α-1,3 bonds may also be present) [72]. Dextrans are naturally occurring exopolysaccharides produced by, among others, bacteria in the genera Lactobacillus, Leuconostoc, Streptococcus, and Acetobacter [81,82]. Depending on the cleavage site, hydrolysis of dextrans yields dextran-oligosaccharides, isomaltotriose or glucose. The most important enzymes acting on dextrans are: endodextranases (EC 3.2.1.11); glucan 1,6-α-glucosidases (EC 3.2.1.70); glucan 1,6-α-isomaltosidases (EC 3.2.1.94); dextran 1,6-α-isomaltotriosidases (EC 3.2.1.95); and branched-dextran exo-1,2-α-glucosidases (EC 3.2.1.115) [81].

1.4.4. Peptidases

Peptidases (EC 3.4) catalyze the hydrolysis of peptide bonds in proteins and other peptides [83,84]. Peptidases are divided into exopeptidases, when acting at the ends of the peptide chain, leading to the release of amino acids, or dipeptides, and into endopeptidases, when acting in the interior of the peptide chain, producing oligopeptides. Exopeptidases are further divided into aminopeptidases (EC 3.4.11.-), if the released amino acid is from the N-terminal, and carboxypeptidases (EC 3.4.16.- to EC 3.4.18.-), if acting on the C-terminal of the peptide [84]. Endopeptidases are divided by their catalytic mechanism: aspartic peptidases (A), cysteine peptidases (C), glutamic peptidases (G), metallopeptidases (M), asparagine lyases (N), peptidases of mixed catalytic type (P), serine peptidases (S), threonine peptidases (T), and peptidases of unknown mechanism (U) [26].

8 Enzyme entry: EC 3.2.1.6, https://enzyme.expasy.org/EC/3.2.1.6.
9 Enzyme entry: EC 3.2.1.23, https://enzyme.expasy.org/EC/3.2.1.23.
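The catalytic-type letters listed above are also the first character of MEROPS family identifiers (for example, a family named S8 would be a serine peptidase family). As a minimal sketch, assuming identifiers made of one letter followed by a number, the catalytic type can be read as follows; the names CATALYTIC_TYPES and catalytic_type are hypothetical and not part of MEROPS.

```python
# Letter codes as listed in the text above (taken from [26]).
CATALYTIC_TYPES = {
    "A": "aspartic peptidase",
    "C": "cysteine peptidase",
    "G": "glutamic peptidase",
    "M": "metallopeptidase",
    "N": "asparagine lyase",
    "P": "peptidase of mixed catalytic type",
    "S": "serine peptidase",
    "T": "threonine peptidase",
    "U": "peptidase of unknown mechanism",
}


def catalytic_type(family_id: str) -> str:
    """Return the catalytic type encoded by the first letter of a family identifier.

    Hypothetical helper: assumes identifiers such as 'S8' or 'M4'
    (one letter followed by a family number).
    """
    letter = family_id.strip()[:1].upper()
    if letter not in CATALYTIC_TYPES:
        raise ValueError(f"unrecognized catalytic type in {family_id!r}")
    return CATALYTIC_TYPES[letter]


if __name__ == "__main__":
    for family in ("S8", "M4", "A1", "C1"):
        print(family, "->", catalytic_type(family))
```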

1.4.5. Ureases

Urea is synthesized as an end product of protein catabolism, following the deamination of amino acids¹⁰, due to the high toxicity of ammonia [85]. Urea can be hydrolyzed to carbon dioxide and ammonia by urease (EC 3.5.1.5) [85], following the reaction¹¹:

NH2CONH2 + H2O ⇌ CO2 + 2 NH3 (4)

Urease from jack bean seeds was the first crystallized enzyme, which led to the conclusion that enzymes have a proteinaceous nature [86,87], and later on, to the discovery of a biological role for nickel, as urease was the first described nickel-dependent metalloenzyme [88,89].

1.5. Biotechnological and industrial application of relevant hydrolases

Esterases are used in the synthesis of optically pure compounds [41,90], a type of compound important in the production of enantiopure medication. Esterases can also be used to recover phenolic compounds from nonwood plants. Phenolic compounds, like ferulic acid, sinapic acid, caffeic acid, coumaric acid, vanillic acid, and vanillin, are highly valued for food, beverages, pharmaceuticals, perfumes, and cosmetics [41,90,91]. After the extraction of the phenolic compounds from nonwood plants, the resulting pulp can then be used in paper production. This approach allows the use of agricultural wastes for paper production, with an added value, and cleaner wastes [91]. Esterases may be useful in the degradation of polymers, pharmaceuticals, and insecticides. An esterase from a compost metagenomic library was found to be active on ketoprofen, a nonsteroidal anti-inflammatory drug, and on polyurethane [92]. More recently, in 2018, Gangola et al. [93] proposed a new mechanism for the degradation of cypermethrin, a synthetic pyrethroid insecticide, by the use of laccases and esterases in a strain of Bacillus subtilis. Cold-active esterases from psychrophiles, i.e., microorganisms with optimal growth at 15 °C or lower [94], may be of interest for the production of short-chain esters, such as in the development of flavors in the food industry [95]. Thermostable esterases from several thermophiles, i.e., microorganisms with optimal growth at 45 °C or higher [94], can be cloned in heterologous hosts, such as Pichia pastoris [96–99]. These esterases exhibit highly desirable physical and chemical characteristics (high temperature stability, alkaline pH stability, resistance against denaturant agents, and/or against organic solvents) for their use in detergent formulations, for the degradation of environmental contaminants, and in biotransformation¹² [97,98].

10 NCBI. PubChem Database. Urea, CID=1176, https://pubchem.ncbi.nlm.nih.gov/compound/urea.
11 ENZYME entry: EC 3.5.1.5, https://enzyme.expasy.org/EC/3.5.1.5.

Lipases are used to emulsify fats and oils in food, pharmaceuticals, and cosmetics, to remove pitch (resins) from wood in papermaking, in detergent formulations, and in leather processing [13]. Lipases are also important for biodiesel production, in the production of cheese and fermented meat, in diagnosis, as biosensors, for biodegradation, in the control and treatment of oil spills, and in polymer synthesis [15,100,101]. The deep-frying process, due to the high temperatures, may lead to the production of toxic substances. Recently, transesterification by a lipase obtained from orange wastes was found to reduce the toxicity of waste frying oil, confirming that frying oil can be bioremediated [102]. Cold-active lipases are very useful in the production of chemicals that cannot withstand high temperatures. Like other lipases, cold-active lipases can be applied in detergents, food processing, waste treatment, and pollution control [103].

DNases are used extensively in diagnostics, and for forensic purposes [42,104]. A subtype of DNases, the restriction endonucleases, active on specific nucleotide sequences, are important tools in biotechnology and research [8,12,42], as they are essential for recombinant DNA technology [18].

Amylases are used in the food industry to produce glucose syrups, maltodextrins, and modified starches, from starchy crops [105], in the conversion of starch into fermentable sugars in brewing, and in bioethanol production [106], for juice clarification, and for syrup viscosity reduction [19]. Amylases are also important in the textile industry, in the pulp and paper industries, in pharmaceuticals, in baking, in the production of feedstuffs, to increase digestibility, in detergents, for the removal of starchy stains, and in waste management, where they can be used to degrade vegetable wastes for compost production [19,65,107,108]. Pullulanases are used in the production of high-maltose and high-fructose corn syrups, in association with α-amylase in detergents, and in the production of cyclodextrins (cyclic oligosaccharides used in foods, pharmaceuticals, and cosmetics) [47,109]. Isoamylases can be used in the production of high-glucose syrups from starch, in the production of cyclodextrins, maltose, and maltitol, and in detergent formulations [110].

Cellulolytic enzymes are used for cotton treatment, in defibrillation of lyocell obtained from wood pulp, in papermaking, in the production of fermentable sugars from pulp and other cellulosic wastes, and in detergent formulations [54,111,112]. Hemicellulases are used to increase the quality of dough in baked goods [54]. Mixtures of cellulases and hemicellulases are used in the extraction of fruit and vegetable juices, and to increase digestibility in animal feed, by eliminating anti-nutritional factors [54,113]. Lignocellulosic materials from agricultural wastes (cellulose, hemicelluloses, and lignin) are important renewable sources of biomass [114]. The hydrolysis of lignocellulosic materials yields fermentable sugars (glucose, xylose, arabinose, mannose, galactose, and rhamnose) that can be used for bioethanol production [114,115]. Lignocellulosic biomass may additionally serve as a source of sugar alcohols, including sorbitol, used extensively as a sweetener in foods, and as a humectant in cosmetics [114]. These materials can also be used in the production of polyhydroxyalkanoates, a type of polyester that can be used as an alternative to petrochemical plastic, in a process that requires several types of cellulases [116].

12 Biotransformation is the transformation of an organic compound inside an organism, using the intracellular enzymatic machinery; it is also referred to as whole cell biocatalysis [64].

Pectinases are used for juice clarification, to reduce viscosity in juices and fruit purees, in tea and coffee processing, in the saccharification of vegetable wastes for bioethanol production, for bioscouring (removal of impurities from cellulose-derived textiles, like cotton and rayon), for biobleaching in kraft pulp, and for deinking in paper recycling [54,63,66]. Pectin methylesterases and pectin acetylesterases are used to remove, respectively, methyl and acetyl groups from pectin [117]. Acetylxylan esterases are used to remove acetyl side-groups from xylan, allowing a more effective use of xylanase [118–120]. Enzyme preparations of cellulases, hemicellulases, and pectinases are used in the maceration and oil extraction from olives, for grape maceration in winemaking, and in research, to produce protoplasts from plant and fungal cells [54,113]. In turn, agricultural wastes rich in pectin, like wheat bran, can be used in the production of pectinases, allowing the production of a value-added product from a waste [121].

β-glucosidases are used in the conversion of inactive glycosidic isoflavones in soy-based foods into their active aglycone counterparts, and in the improvement of food and beverage flavor, by releasing aromatic compounds from glycosidic precursors [115].

β-galactosidases are used extensively in the dairy industry to hydrolyze lactose from milk and other dairy products [117,122–124]. β-galactosidases can be used as a digestive aid, prior to the consumption of products with lactose, and in the production of galacto-oligosaccharides, for incorporation in foods [125]. β-galactosidases are also used to release galactose residues from xyloglucans and pectins [117].

Agarases are used to produce agar-derived oligosaccharides as food additives, and for incorporation in cosmetics, due to their antioxidant activity. Agarases are also used in research to prepare protoplasts from red algae, and in DNA recovery from agarose gels [74,75].

Dextrans have several medical and pharmaceutical applications, and are extensively used in the food industry as thickening agents [82]. In some industries, such as in sugar production, the appearance of dextrans is undesirable, due to the increased viscosity in the sugar mother liquor, leading to production losses. To prevent these losses, dextranases are widely used in sugar production [126]. Dextrans produced by cariogenic bacteria (e.g., bacteria in the genera Lactobacillus and Streptococcus) enable the development of dental plaque, which may lead to the appearance of dental caries [127,128]. Dextranases are used for caries prevention in mouthwash formulations, where they act as inhibitors of biofilm production [129].

Proteases are one of the most important types of enzymes in industrial applications, accounting for about 60% of the world enzyme market [83,84,130]. Microbial proteases are used in the dairy industry, for enhancing nutritional value and digestibility in feedstock, to reduce allergenicity in food processing, as digestive aids, in pharmaceuticals, for diagnostics, in peptide synthesis, and in waste treatment, namely in the degradation of proteinaceous animal byproducts, like feathers, fur and fish scales. They are also used in leather treatment, for silver recovery from X-ray and photographic film (by hydrolyzing the gelatin coating in the film), as detergent components (for clothes, contact lenses, and dentures), and for research purposes [15,19,65,83,84,117,130–132].

Cold-adapted proteases, from psychrophiles, can be very useful when the manufacturing processes require lower temperatures, such as in cheesemaking and other food production, for detergents, and in the treatment of proteinaceous waste at low temperatures [133].

Ureases are used in the production of alcoholic beverages, like wine, to prevent the appearance of ethyl carbamate, a potential carcinogen, from the reaction between urea and ethanol [85,117,134–136]. Ureases may also be applied in industrial wastewater treatment as a cost-effective way to remove urea, as an alternative to high-temperature high-pressure treatments [135,137]. In medicine, ureases are used for urea removal in dialysis [135], in the treatment of hypertension, and in chemotherapy, as in L-DOS47, an anticancer agent derived from jack bean urease [85]. Ureases are very sensitive to the presence of heavy metals, a feature that can be used for the detection of this type of element in environmental samples, or directly in the environment, using urease-based biosensors [135]. Biosensors to detect pollutants and for clinical applications have also been developed, and can function as low-cost alternatives to conventional methods [85,135]. Ureases are very unstable enzymes, which may limit their use. This problem can be overcome by enzyme immobilization [85]. Immobilized urease (on silica, tungsten, polytetrafluoroethylene, or acrylamide) can be used for urea content determination in fluids (blood serum, urine, water, wastewater, food, and wine). Novel urease immobilization methods allow higher stability, activity, shelf life, and enzyme reutilization [9,135]. Urease activity is associated with pathogen virulence (e.g., urease production in Helicobacter pylori is regarded as essential for bacterial colonization of the gastric mucosa), and effective inhibition of urease activity could represent a way to control infections by ureolytic bacteria. Urea is frequently used in crop fertilization as an affordable nitrogen source. Urease inhibitors may also be used in agriculture to control the hydrolysis of urea by soil microbiota, to prevent plant damage from ammonia toxicity [135,138].

Supplementary Table 2 in Appendix B shows a summary of some of the most relevant biotechnological and industrial uses of commercial hydrolases discussed in this thesis.

1.6. Screening for novel enzymes

Evidence suggests that enzymatic profiles are associated with microenvironments, possibly due to the available nutrients, or to natural selection driving forces [139]. Several screening studies aiming to find new enzymes were conducted using samples from locations where the targeted enzymes were more likely to be found. Proteases were isolated from marine wastes [140], and from soils [141]. New keratinases were found in places with feather residues [142–144], and in deteriorated leather [145]. Lignocellulose degrading enzymes were isolated from paper mill sludges [146], from bacteria associated with invertebrates known to eat cellulosic materials, like termites and bookworms [147], and from soil and wastes in sugar mills [148]. Pectinases were isolated from decaying fruits and vegetables [149]. Starch degrading enzymes were isolated from soils used to produce starchy crops, like potatoes [150], and from cassava processing sites [151]. Extremozymes are isolated from extremophiles [37,152]. Novel biochemical pathways have been found in bacterial isolates from wastewater bioreactors [153], and bacteria able to degrade hydrocarbons were isolated from oil contaminated sites [154,155].

The most frequently referred methods for the screening of novel enzymes include: catalysis assays with specific substrates; assays with chromogenic or fluorogenic substrates; assays with fluorophores; enzyme fingerprinting; microarrays; mass-spectrometry-assisted enzyme-screening; genome sequencing, and subsequent analysis of enzyme-encoding genes; and metagenomic approaches [19,156–159]. Some of these techniques have been optimized for high-throughput screening [156,160,161].

Enzyme profiling, or enzyme fingerprinting, allows differentiation between enzymes with very similar activity, by measuring enzymatic activity across several substrates [157]. Activity-based protein profiling (ABPP), a valuable technology in proteomic studies [162], has been used for enzyme screening in living extremophilic Archaea, and can be useful in studies of metagenomic libraries [163]. Metagenome approaches, like metagenomic screening, metagenome expression libraries, and genome mining, are used in enzyme screening in nonculturable bacteria and archaea [13,19].
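The idea behind enzyme fingerprinting, comparing activities measured across several substrates, can be sketched as a distance between normalized activity profiles. This is only an illustrative sketch and not the method of [157]: the function names, the choice of Euclidean distance, the substrate labels, and all activity values are assumptions made for the example.

```python
import math


def normalize(profile: dict) -> dict:
    """Scale activities so that the most active substrate equals 1.0."""
    peak = max(profile.values())
    if peak <= 0:
        raise ValueError("profile needs at least one positive activity")
    return {substrate: activity / peak for substrate, activity in profile.items()}


def fingerprint_distance(profile_a: dict, profile_b: dict) -> float:
    """Euclidean distance between two normalized activity profiles.

    Substrates missing from one profile are treated as zero activity.
    """
    a, b = normalize(profile_a), normalize(profile_b)
    substrates = sorted(set(a) | set(b))
    return math.sqrt(sum((a.get(s, 0.0) - b.get(s, 0.0)) ** 2 for s in substrates))


if __name__ == "__main__":
    # Hypothetical activities (arbitrary units) on esters of increasing acyl
    # chain length; the labels and values are invented for illustration.
    enzyme_1 = {"acetate (C2)": 8.0, "butyrate (C4)": 10.0, "palmitate (C16)": 0.5}
    enzyme_2 = {"acetate (C2)": 4.0, "butyrate (C4)": 5.0, "palmitate (C16)": 0.25}
    enzyme_3 = {"acetate (C2)": 1.0, "butyrate (C4)": 3.0, "palmitate (C16)": 9.0}
    print("enzyme 1 vs 2:", fingerprint_distance(enzyme_1, enzyme_2))
    print("enzyme 1 vs 3:", fingerprint_distance(enzyme_1, enzyme_3))
```

Normalizing each profile to its own maximum makes the comparison depend on the substrate preference pattern rather than on the absolute amount of enzyme in each assay.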

1.7. Enzymes of microbial origin

Microorganisms comprise all unicellular life forms, including all Bacteria and Archaea, and many eukaryotic organisms, including algae, protists, and fungi [94]. Microorganisms are highly diverse and can be found in all niches on the planet where life is present [164]. A recent tree of life, based on genome sequencing, emphasizes this diversity, with 92 named phyla in the Domain Bacteria alone, representing the vast majority of life diversity [165]. Up until the 1970s, enzymes for industrial applications were primarily of plant or animal origin. The rise in enzyme demand, and new fermentation techniques, led to the exploration of microorganisms as a novel source of enzymes for biotechnological applications [9]. An important source of novel enzymes are extremophiles, organisms that live under extreme conditions (pH, temperature, pressure, salinity, oxidative stress, radiation) and have established survival strategies, which can result in a wide variety of enzymatic systems. Extremozymes are generally active under harsh conditions, and are compatible with many manufacturing processes [19,166].

Enzymes of microbial origin are economically competitive. Microorganisms have high growth rates, even when growing in inexpensive culture media. Microbial enzymes can be easily produced in reactors, with batch fermentation periods generally falling between 30 and 150 h for maximum yield [9,13]. Enzyme production can be easily optimized, by selecting improved microorganisms, or by controlling environmental conditions in bioreactors. In this way, microbial enzyme production does not depend on environmental conditions or on the time of the year, as happens with several enzymes derived from plants or animals. Furthermore, enzymes from microorganisms are easier to extract. Many microbial enzymes are secreted to the exterior of the cell, which leads to fewer extraction and purification steps during industrial enzyme production. Intracellular enzymes (endoenzymes) of microbial origin are also easier to obtain than their plant or animal counterparts, and, generally, show high stability [9,13,19,167].

Progress in recombinant DNA technology further increased the potential for microbial use in enzyme production [65,167], as it has made possible the expression of heterologous proteins. Applying this type of technique to microorganisms raises fewer ethical concerns than applying it to plants or animals [9].

Recombinant DNA technology may lead to higher yields, and to the production of enzymes with enhanced properties [13]. Improved enzymatic activity may also be obtained by directed evolution experiments [168], random mutagenesis [14], and active site engineering [169]. Computational methods, like the Adaptive Substituent Reordering Algorithm (ASRA), are valuable tools for enhancing the efficiency of directed evolution [170,171]. Also possible is the reorganization of enzymes into new biosynthetic pathways [172].

1.8. Bacteria identification for biotechnological applications

Most microorganisms with which humans come into contact can be considered commensals. They intervene in the metabolism of food products and help the host to respond against infections [173]. Some microorganisms, termed pathogens, are causative agents of diseases. Microorganisms that are always associated with disease are called strict pathogens. Most microorganisms associated with diseases, though, can be found in the host’s normal microbiota, or in the environment, only causing disease when certain conditions, like a weakened immune system, are met. These are called opportunistic pathogens [173].

Pathogens (whether strict or opportunistic) cannot be used in the production of goods that are intended for human or animal consumption. Enzymes involved in the food and feed industry should be produced from organisms (animals, plants, yeasts, filamentous fungi, and bacteria) generally recognized as safe¹³ (GRAS) [167], whose assumption of safety is recognized by the United States Food and Drug Administration, or by microorganisms that have been granted qualified presumption of safety¹⁴ (QPS) status by the European Food Safety Authority. GRAS and QPS microorganisms are a valuable source of new enzymes. When a desirable enzyme is found in organisms that lack such a status, it can be expressed as a recombinant enzyme in a suitable class 1 organism [174]. Because using GRAS enzyme-producing microorganisms is desirable in the food, feed, and beverage industries, the identification of the desired microorganism is an important step in enzyme prospection.

Traditionally, bacteria identification relied on a polyphasic approach, including morphological and metabolic characterization, and on the application of molecular techniques, like DNA-DNA hybridization (DDH) [175], but this methodology is often time-consuming. New approaches for faster identification have been developed, including 16S ribosomal RNA (rRNA) gene sequence comparison against curated sequences of known species [176–178]. The small subunit ribosomal RNA (ssrRNA) sequence comparison was the basis of the first tree of life, in 1990, which divided all biodiversity into three domains, Bacteria, Archaea, and Eukarya [179,180]. Since the 1990s, 16S rRNA gene sequence comparison has been increasingly used for bacterial identification, and several studies were conducted to associate 16S rRNA gene sequence homology with DDH [181].

The 16S rRNA gene is nowadays the most important housekeeping gene for bacteria identification, and it is used as a molecular chronometer in phylogenetic analysis of bacteria. This is because: 1) the 16S rRNA gene is present in all bacteria; 2) its function has remained constant during evolution and it is essential in messenger RNA (mRNA) translation, being an overall highly conserved gene with some variable regions; 3) with approximately 1500 base pairs (bp), it is considered large enough that differences in nucleotide sequence can be statistically relevant; and 4) it is easily amplified and sequenced [176–179]. Moreover, there are several dedicated 16S rRNA gene databases with curated sequences of known species that allow the comparison of a sequence of an unknown bacterium with the sequences of the database, for rapid identification, at least at the genus level, and, in many cases, at the species level [176,178]. Nevertheless, the 16S rRNA gene may have low resolution power for closely related bacteria, which may truncate the identification at the genus level, and only phylogenetic analysis with complete or near-complete 16S rRNA gene sequences provides reliable phylogenetic relationships [175]. The 16S rRNA gene is also used to infer the phylogeny of a given bacterium, even when its complete identification is not possible. In these cases, a phylogenetic analysis may allow the closest relatives of the unidentified bacterium to be determined.

13 FDA. Generally Recognized as Safe, https://www.fda.gov/food/ingredientspackaginglabeling/gras/.

14 EFSA. Qualified presumption of safety, https://www.efsa.europa.eu/en/topics/topic/qualified-presumption-safety-qps.

Up until 2006, two gold-standards were considered in bacterial : DDH and 16S rRNA gene homology [181]. Two bacterial strains with less than 70% of DNA-DNA reassociation or more than 5 °C of ΔTm (as recommended in 1987 [182]), were considered as belonging to different species. In terms of 16S rRNA gene nucleotide sequence homology, the standard for identifying an isolate as belonging to a determined species was established as 97% of 16S rRNA gene sequence homology [175,181,183], and 95% homology for genus allocation [184]. The percentage for species differentiation was raised to 98.7% in 2006 [181], but eight years later, in 2014, the accepted value was lowered to 98.65% [185]. Nonetheless, several studies have shown that the use of these values (either 98.7% [184,186] or 98.65% [183]) in the identification of some bacteria should be met with caution, as their use alone may lead to bacteria misidentification.
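The threshold-based reasoning above can be illustrated with a minimal sketch. It assumes two 16S rRNA gene sequences that have already been aligned to the same length, skips gapped positions (an assumption, as gap handling is not specified in the text), and uses the 98.65% and 95% values quoted above. The function names are hypothetical, and the returned labels are deliberately provisional, since these thresholds alone may lead to misidentification.

```python
# Illustrative sketch only: function names and gap handling are assumptions.
SPECIES_THRESHOLD = 98.65  # % identity for species allocation (value quoted in the text)
GENUS_THRESHOLD = 95.0     # % identity for genus allocation (value quoted in the text)


def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, non-empty, and of equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped positions are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


def provisional_rank(identity):
    """Map a percent identity onto the thresholds discussed in the text."""
    if identity >= SPECIES_THRESHOLD:
        return "same species (provisional)"
    if identity >= GENUS_THRESHOLD:
        return "same genus (provisional)"
    return "different genus (provisional)"


# Hypothetical toy alignment, far shorter than a real ~1500 bp gene
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
rank = provisional_rank(identity)
```

A result of this kind would only be a first indication; as noted above, closely related bacteria may require additional marker genes.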

Other genes routinely used in bacterial identification include the housekeeping genes for DNA gyrase (gyrA and gyrB), and the genes for the bacterial RNA polymerase (rpoA, rpoB, rpoC, and rpoD) [187]. Newer approaches include the analysis of several single-copy phylogenetic marker genes (pMGs), mainly encoding ribosomal proteins, to delineate prokaryotic species [188–190].

1.9. Bioremediation – microorganisms and enzymes

1.9.1. Green engineering, bioremediation, biostimulation, and bioaugmentation

The progress in green chemistry, and the increasing concern towards environment protection, led to the development of green engineering, defined as “the design of systems and unit processes that obviate or reduce the need for the use of hazardous substances while minimizing energy usage and the generation of unwanted by-products” [191].

Bioremediation is one application of green engineering [192]. In bioremediation, contaminants are removed using the biodegradative processes of organisms (bacteria, fungi, algae, or plants), a potentially cost-effective alternative to more traditional ways of waste disposal, like landfilling or incineration [15,193]. Most bioremediation processes occur in situ, because this reduces the likelihood of contaminant spreading and avoids the costs associated with removing contaminated soil or water, but ex situ bioremediation may offer some advantages, such as the control of the growth conditions of the involved microorganisms [192].

As bioremediation is frequently a time-consuming process [192], one of the methods to improve the efficiency of in situ bioremediation is biostimulation: the addition of limiting nutrients whose short supply might be hindering the growth of the natural degradative population already in situ [193,194].

Often, the contaminants are not effectively degraded by the native microbiota [195]. In these cases, an external inoculum of specially selected microorganisms is frequently used to expedite the degradation of the contaminants. This process is called bioaugmentation [193–195].

1.9.2. Bacterial consortia

In most treatments that resort to bacteria to help biodegradation, the bacteria do not appear isolated, but in flocs or in biofilms [196]. Growing in a consortium may help bacteria to survive conditions that isolated bacterial cells would not be able to withstand [197], and may lead to increased efficiency in resource utilization and productivity in the community [198].

In recent years, microbial consortia have been studied in agriculture [199–201], waste treatment [146,202], pollution control in sites contaminated with oil and oil derivatives [203–207], the degradation of recalcitrant substrates, like herbicides and polycyclic aromatic hydrocarbons [208,209], and biomass recycling [210,211].

1.9.3. Enzymes in bioremediation

When microorganisms alone are not able to degrade the contaminants, it is possible to use enzymes to expedite the degradation process. Enzymes may be active when the abiotic conditions (temperature, humidity, salinity, pH) are not favorable for microbial growth, or when high concentrations of toxins are present. In these scenarios, enzymes offer an alternative to other, often more toxic and hazardous, chemical processes [14].

The use of enzymes in bioremediation has several advantages: 1) due to their high specificity, enzymes produce very few byproducts; 2) being proteins, enzymes are biodegradable [14]; 3) the enzymatic activity can be easily controlled by the use of inhibitors, allosteric regulators, changes in temperature, or pH [9]; and 4) enzymes are easier to transport and maintain than live microorganisms [14].

1.10. Wastewaters

One example where bioremediation is very important is wastewater treatment, as wastewaters are one of the major contributors to water pollution [212]. Wastewaters include any water that has been used in human activities. They are generally divided into two categories: domestic wastewater, resulting from household activities and offices; and industrial wastewater, derived from manufacturing facilities [212]. In turn, domestic wastewater can be divided into black wastewater, derived from toilet wastes, and grey wastewater, mainly from kitchen sinks, bathtubs, lavatories, and washing machines, even though, in most households, these two types of wastewater are disposed of together [213].

To prevent water pollution, wastewaters are generally collected and undergo some type of treatment to remove organic matter and pollutants. As of 2015, about 30% of the wastewater went untreated in high-income countries, but this value exceeded 90% in low-income countries [214].

In broad terms, wastewater treatment is generally carried out as a four-step process: 1) preliminary treatment, where coarse solids and some oils are removed; 2) primary treatment, where settleable solids are removed; 3) secondary treatment, where organic matter is decomposed, mainly by aerobic microorganisms; and 4) tertiary treatment, to remove residual organic and inorganic substances, pollutants, and pathogenic organisms by disinfection [212,215].

One way to characterize wastewaters, in terms of organic matter, is through the biochemical oxygen demand (BOD). Other ways include, but are not limited to: the chemical oxygen demand (COD); total organic carbon (TOC); total solids (suspended, dissolved, and settleable); total nitrogen; and total phosphorus [215,216]. BOD corresponds to the amount of oxygen required by microorganisms to degrade the organic matter in an aqueous sample in the presence of oxygen. This measure is frequently used to assess the organic matter load in wastewaters, and to evaluate the quality of the treatment in wastewater plants [217,218]. BOD can also be used to assess the biodegradability of a substance or effluent.

In terms of composition, the major component of wastewaters is water, comprising roughly 99.9% (by weight) of the total. The remaining 0.1% are solid components, comprising proteins, carbohydrates, and lipids [212]. In a typical domestic effluent, the BOD (5 days, 20 °C) varies between 560 mg/l (low water consumption and/or low infiltration scenarios) and 230 mg/l (high water consumption and/or high degrees of infiltration scenarios) [219]. Proteins correspond roughly to 40% of the total organic matter, the amount of carbohydrates varies between 25 and 50%, and oils correspond to about 10% [215].

Urea, which represents more than 50% of the total organic solids in urine, is an important net contributor to the nitrogen content of wastewaters containing excreta from humans or domestic farm mammals (the other being ammonia, from the degradation of protein in feces) [220]. On the other hand, cellulose, essentially from toilet paper, is one of the most important contributors to the BOD in domestic wastewater [221,222], and can affect the normal function of wastewater treatment plants (WWTPs), a problem that has been studied at least since the 1960s [223]. Cellulose is also one of the major constituents of wastewater from paper mills [224].

1.11. The BioTask Bioremediation Culture collection

The BioTask Bioremediation Culture (BBC) collection is a cryopreserved collection of 202 bacterial strains¹⁵ naturally occurring in man-made environments, kept at Lab Bugworkers, Faculty of Sciences, University of Lisbon (FCUL), and established as part of a collaborative research between BioISI/FCUL and BioTask in an ongoing doctoral program. The main purpose of establishing the collection was to isolate bacterial strains with potential biotechnological application, particularly in bioremediation [225].

For the collection construction, samples were collected in Portugal, from October 2014 till April 2016, from sources with severe human impact. All isolates were characterized by routine phenotypic tests: cell morphology (rods or cocci), cell wall type (Gram staining and KOH test), and catalase and oxidase (cytochrome c oxidase) activities. The BBC collection diversity was assessed by polymerase chain reaction (PCR) fingerprinting, employing two distinct methodologies: 1) microsatellite/minisatellite primed PCR (MSP-PCR), using the primer csM13 (5’-GAGGGTGGCGGTTCT-3’) [226], a core sequence of the M13 bacteriophage used for DNA fingerprinting in all three domains of life [227–232]; and 2) random amplification of polymorphic DNA (RAPD), with the universal primer PH (5’-AAGGAGGTGATCCAGCCGCA-3’) [233–235]. Strains were divided into four groups, depending on cell morphology and Gram staining results: Gram-positive rods, Gram-positive cocci, Gram-negative rods, and Gram-negative cocci. Relationships among strains were analyzed using the Pearson product-moment coefficient (also known as Pearson correlation coefficient) [236] for similarity, and the unweighted pair group method with arithmetic average (UPGMA) [237], a sequential agglomerative hierarchical non-overlapping clustering technique (SAHN), for hierarchical clustering. Data were treated to generate four composite dendrograms, one for each of the above-described groups (see Supplementary Figure 1 in Appendix C).

15 The strains are preserved in axenic cultures at -80 °C, in Brain-Heart Infusion Broth (BHI; BIOKAR Diagnostics, France), with 20% v/v glycerol.

1.12. Thesis scope

The main objectives of this work were to search for novel bacterial hydrolase producers in the newly constituted BBC collection, and to study their possible biotechnological and/or industrial application. Secondary objectives included metabolic characterization, biomass production optimization, and identification of the studied strains. These objectives were framed by the fact that the BBC collection was created for its application in bioremediation, and that one of the most important applications of hydrolase-producing bacteria is in wastewater treatment. To achieve these objectives, a set of strains from the BBC collection was subjected to: 1) laboratory routine tests; 2) phenotypic characterization by the use of several substrates as sole carbon source; 3) growth analysis in distinct general-purpose culture media; 4) analysis of the influence of environmental factors on strain growth; 5) hydrolase screening; and 6) identification by partial 16S rRNA gene sequence analysis.

After ranking the strains in the working set (using the results from the above procedures), the study focused on the potential applicability of strains with promising enzymatic activity to the biological treatment of wastewater. Strains with fast urease activity were studied, both individually and in a consortium, for their potential application in the removal of ammonium and urea. Strains with positive results in the degradation of cellulose were studied for their use in the degradation of cellulosic materials.

2. Materials and Methods

1. Strain recovery, characterization, and selection
2. Reduction of the number of strains and selection of the working set
3. Characterization of the strains in the working set
4. Screening of hydrolytic enzymes
5. Strain identification and phylogenetic analysis
6. Case study – urea and ammonium removal for wastewater treatment
7. Case study – cellulose degradation for wastewater treatment
8. Workflow of the study

2.1. Strain recovery, characterization, and selection

2.1.1. Initial set of strains

For practical reasons, screening an entire collection for enzymatic activities is not feasible. The adopted strategy was to select a representative initial set of strains, with about 20% of the BBC collection. This set of strains was selected to include representatives from each composite dendrogram from a previous diversity analysis performed over the BBC collection. See section 1.11 and Appendix C for more details.

2.1.2. Strain recovery, growth conditions, and routine maintenance

For all selected strains, a 1 µl-loop of cryopreserved culture was streaked onto Trypto-casein Soy Agar (TSA) plates, prepared from Trypto-casein Soy Broth (TSB; BIOKAR Diagnostics, France) supplemented with 1.5% w/v of bacteriological agar (BIOKAR Diagnostics, France), and incubated at 28 °C, for 24 h, or until visible growth.

In laboratory routine, strains were kept in axenic conditions on TSA plates, at 4 °C, and sub-cultured every four weeks (up to three times). Strains were recovered from the cryopreserved collection every four months, or when viability loss or contamination was detected in the refrigerated cultures.

2.1.3. Phenotypic confirmation

For each strain in the initial set, phenotypic characteristics were confirmed by Gram staining, KOH test, and oxidase and catalase activities. For cell morphology and Gram staining, cells were observed under brightfield microscopy, as described in [238]. The KOH test was performed to further confirm cell wall type, and was carried out as described in [239,240], with a 3% w/v solution of potassium hydroxide. Catalase activity was detected by effervescence production with a 3% w/v solution of hydrogen peroxide, as described in [241]. Oxidase (cytochrome c oxidase) activity was detected using a 1% w/v solution of N,N,N',N'-tetramethyl-p-phenylenediamine (TMPD), as described in [241]. All phenotypic tests were carried out with freshly grown cells on TSA, incubated at 28 °C, for 18-24 h.

2.2. Reduction of the number of strains, and selection of the working set

For biotechnological applications, it is preferable to use strains with high growth rates, at the lowest possible cost. For this reason, the initial set was subject to a rapid growth screening aiming at a final selection of strains (the working set), to focus the enzymatic activity prospection on the most promising candidates. A final set of strains that simultaneously showed high growth in TSB and/or in Nutrient Broth (NB; BIOKAR Diagnostics, France) and that represented the largest diversity possible (by including representatives of all major groups of strains) was selected from the initial set of strains.

2.2.1. Experimental methodology

To reactivate strains, 10 µl-loops of refrigerated culture were grown at 28 °C, for 18-24 h, in 2 ml microtubes with 1 ml of TSB. After incubation, microtubes were centrifuged (10 000 rpm, 3 min, Beckman Coulter Microfuge 18 Centrifuge¹⁶), and the supernatant discarded. The resulting pellets were resuspended in 500 µl of sterile saline solution (0.8% w/v of sodium chloride). The resulting bacterial suspensions were used as inocula. To ensure that the inocula contained a reasonable amount of metabolically active bacteria, the suspensions were used within a period of 2 h. The assay was carried out with two media, TSB and NB, using 100-well honeycomb plates. Each well was filled with 270 µl of culture medium and 10 µl of inoculum. Wells were inoculated in triplicate. Plates were incubated for 48 h, at 30 °C, with continuous shaking, in an automated growth curve analysis system (Bioscreen C, Oy Growth Curves Ab). Wells with medium, but without inoculum, were incubated in each plate as blank controls. Optical density (OD) measurements (absorbance) for each well were performed at 600 nm [242,243], every 30 min. Inocula viability and purity were assessed by streaking TSA plates, incubated at 28 °C, for 18-24 h.

2.2.2. Data analysis

During bacterial growth, the turbidity of a liquid culture increases in proportion with the number of bacteria (both viable and dead) in the suspension. The evolution of an OD curve over time, with respect to its initial value, can be used to derive the growth rate of microorganisms in liquid media [243], or to qualitatively compare the growth rate between cultures. One way to quantify the evolution of the OD, over a period of time, is by computing the corresponding area under the curve (AUC). This has the advantage of minimizing noise errors, and provides a numerical value that can be used to compare the growth of two cell cultures. This approach has limitations. Two distinct curves may yield the same AUC value (e.g., a fast-growing curve with a low settling point, and a curve with a high settling point, but showing a considerable lag phase). Also, if cell lysis occurs, the AUC value may be similar to that of a slow-growing curve where cell lysis does not occur. For this reason, when using AUC values (AUCs), special care has to be taken when drawing conclusions, namely by taking into account the lag phase and the final settling value.

In this work, AUCs for 4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44 and 48 h of incubation were computed for both inoculated and blank wells (blank AUCs) using Microsoft Excel for Office 365 (v16.0). For this, relative OD values (ODr) were computed by subtracting the corresponding initial OD value. The AUC for each well was then computed, from the corresponding ODr sequence, using first-order numerical integration (trapezoidal rule) [244]. To discard possible alterations in the media turbidity during incubation, the blank AUCs were subtracted from the AUCs of the inoculated wells. The resulting values are called the net area under the curve (NAUC). A negative NAUC value was assumed to be zero.

To outline major representative groups of strains, the results from the phenotypic confirmation procedure (cell morphology, Gram staining, oxidase, and catalase), and NAUCs (for all considered periods, and for both TSB and NB) were combined and subjected to a principal component analysis (PCA) [245], using NTSYSpc (version 2.21L, Applied Biostatistics). With this approach, phenotypic information, lag phase (by the introduction of the first few NAUCs), and settling values, are all taken into account in strain discrimination.

16 10 000 rpm (revolutions per minute) are equivalent to 9 240 rcf (relative centrifugal force).
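The NAUC computation described in this section can be sketched as follows. This is a minimal illustration in Python rather than the spreadsheet used in this work; the function names and the example readings are hypothetical, and the handling of replicate wells is not shown because it is not detailed in the text.

```python
# Illustrative sketch only: names and example readings are hypothetical.

def trapezoid_auc(times_h, od_values):
    """AUC of the relative OD curve (OD minus its initial value), trapezoidal rule."""
    if len(times_h) != len(od_values) or len(times_h) < 2:
        raise ValueError("need matching time and OD series with at least two points")
    od_rel = [od - od_values[0] for od in od_values]  # relative OD (ODr)
    area = 0.0
    for i in range(1, len(times_h)):
        dt = times_h[i] - times_h[i - 1]
        area += dt * (od_rel[i] + od_rel[i - 1]) / 2.0
    return area


def net_auc(times_h, od_well, od_blank):
    """NAUC: inoculated-well AUC minus blank AUC, with negative values set to zero."""
    nauc = trapezoid_auc(times_h, od_well) - trapezoid_auc(times_h, od_blank)
    return max(nauc, 0.0)


# Hypothetical OD readings taken every 30 min over the first 2 h
times = [0.0, 0.5, 1.0, 1.5, 2.0]
well = [0.10, 0.12, 0.18, 0.30, 0.50]
blank = [0.08, 0.08, 0.09, 0.09, 0.09]
nauc_2h = net_auc(times, well, blank)  # approximately 0.2375 OD·h
```

Computing the value for each of the incubation periods listed above would amount to truncating the two series at 4, 8, ..., 48 h before calling `net_auc`.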

PCA, initially developed by Pearson in 1901 [246], and independently, in 1933, by Hotelling [247], is a multivariate data analysis methodology used as a nonhierarchical ordination procedure. It converts possibly correlated variables, by an orthogonal transformation of the data, into a new set of linearly uncorrelated variables, named principal components (PCs). A PCA is applied to reduce the complexity of a data set while retaining as much variability as possible from the original data. The reduction is achieved by redistributing the variability among the PCs, ordered by percentage of variance, so that the first few PCs retain most of the total variance of the original variables [245]. Usually, it is considered that the first few PCs provide a good representation of the original data if they account for at least 75% of the original total variance.
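The 75% criterion mentioned above can be illustrated with a short sketch that counts how many PCs are needed to reach a given share of the total variance. The eigenvalues in the example are hypothetical (they are not results from this work), and the function name is an assumption introduced for illustration.

```python
# Illustrative sketch only: the eigenvalues below are hypothetical.

def components_needed(eigenvalues, threshold=0.75):
    """Smallest number of PCs whose cumulative share of variance reaches the threshold."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = 0.0
    # PCs are ordered by decreasing variance (eigenvalue)
    for count, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return count
    return len(eigenvalues)


# Hypothetical eigenvalues of a correlation matrix with five variables
n_pcs = components_needed([4.2, 2.1, 0.9, 0.5, 0.3])
```

With these example values, the first PC alone explains about 52% of the variance and the first two about 79%, so two PCs would satisfy the criterion.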

A matrix of strains, or operational taxonomic units (OTUs), versus variables, was standardized using the standard score (by subtracting the population mean from an individual initial score, and then dividing the difference by the population standard deviation) [248]. The standardized matrix was then used for PCA. A correlation matrix was computed using Pearson correlation between every two variables, followed by eigendecomposition. The relevant explanatory variables were selected by analysis of the resulting matrix of eigenvectors. Only variables with a correlation coefficient below -0.500 (negative correlation) or above 0.500 (positive correlation) were considered as explanatory. A projection matrix was computed from the standardized matrix and the eigendecomposition results. From the projection matrix, OTUs were plotted in two-dimensional charts in the PCA space.
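The standardization and correlation steps described above can be sketched in a few lines. This is not the NTSYSpc procedure itself, only an illustration of the standard score and of the Pearson correlation matrix computed from it; the function names and the toy data are hypothetical, and the subsequent eigendecomposition and projection are not shown.

```python
# Illustrative sketch only: names and toy data are hypothetical.
import math


def standardize(column):
    """Standard score: (value - population mean) / population standard deviation."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / n)
    if sd == 0:
        raise ValueError("a constant variable cannot be standardized")
    return [(x - mean) / sd for x in column]


def correlation_matrix(columns):
    """Pearson correlation between every two variables (one list per variable)."""
    z = [standardize(col) for col in columns]
    n = len(z[0])
    # For population z-scores, the mean product of two columns is Pearson's r
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]


# Toy data: three variables measured on four OTUs
variables = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
]
corr = correlation_matrix(variables)
```

In this toy example the first two variables are perfectly correlated and the third is perfectly anti-correlated with both, so the off-diagonal entries are 1 and -1 up to rounding.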

A dendrogram from the projection matrix was constructed using NTSYSpc, to improve the discrimination between strains. The dendrogram was generated using taxonomic distance [249] as association measure, and UPGMA as the hierarchical clustering method. The cophenetic correlation coefficient [249–251], computed from a two-way Mantel test [252], was used as a measure of the faithfulness of the dendrogram as a representation of the association matrix.

A final working set of 25 strains was selected for further analysis. Representatives of all clusters from the dendrogram that showed rapid growth (with the highest NAUCs for 36 h of incubation, either in TSB, or in NB) were selected for this set.

2.3. Characterization of the strains in the working set

2.3.1. Sole carbon source assay (metabolic characterization)

2.3.1.1. Experimental methodology

The working set was screened for the ability to grow using various substrates as sole carbon source. A total of 23 substrates were used in the metabolic characterization: eight monosaccharides, two disaccharides, eight sugar alcohols, and five esters (see Table 5).

Table 5. Substrates used in the metabolic characterization.

Monosaccharides (aldoses): Arabinose, Galactose, Glucose, Mannose, Ribose, Xylose
Monosaccharides (ketoses): Fructose, Sorbose
Disaccharides: Cellobiose, Lactose
Sugar alcohols: Arabitol, Galactitol, Glycerol, Mannitol, Myo-Inositol, Ribitol, Sorbitol, Xylitol
Esters: Methyl Butyrate, Tween 20, Tween 40, Tween 60, Tween 80

Monosaccharides used here belong to the most prevalent form in nature: arabinose and sorbose belong to the L-series; the remaining monosaccharides, as well as cellobiose, arabitol, mannitol, and sorbitol, belong to the D-series. Strains were incubated in Yeast Nitrogen Base Without Amino Acids (YNB; Sigma-Aldrich, USA) with a final pH adjusted to 7.4 with phosphate buffer (0.4% w/v of potassium dihydrogen phosphate, 0.6% w/v of disodium hydrogen phosphate, and 0.04% w/v of sodium chloride) supplemented with 1% w/v of substrate as the carbon source. Strains were also inoculated in pH-adjusted YNB without an added carbon source, to ensure that the medium in the cultivating conditions did not allow bacterial growth. The assay was conducted in 100-well honeycomb plates, in triplicates, following the methodology described in section 2.2.1 for the rapid growth screening, with an incubation period of 36 h.

2.3.1.2. Data analysis

Strain growth for each substrate was assessed by NAUC analysis. The NAUC for 36 h in TSB was used as the growth reference for each strain. Growth was considered negligible when the corresponding NAUC was below 5% of the NAUC for TSB, weak positive when equal to or above 5% and below 30%, positive when equal to or above 30% and below 70%, and strong positive when equal to or above 70%. A NAUC of zero was considered as no growth (NG). The utilization of substrates by each strain was assessed by category (aldoses, ketoses, disaccharides, sugar alcohols, and esters), by the mean relative growth for all substrates in each category.
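The growth categories defined above can be expressed as a simple decision rule. The sketch below is only an illustration of those thresholds; the function name and the example NAUC values are hypothetical.

```python
# Illustrative sketch only: function name and example values are hypothetical.

def classify_growth(nauc_substrate, nauc_tsb_reference):
    """Classify growth on a substrate relative to the 36 h NAUC in TSB."""
    if nauc_tsb_reference <= 0:
        raise ValueError("the TSB reference NAUC must be positive")
    if nauc_substrate == 0:
        return "NG"  # no growth
    relative = 100.0 * nauc_substrate / nauc_tsb_reference  # % of the reference
    if relative < 5:
        return "negligible"
    if relative < 30:
        return "weak positive"
    if relative < 70:
        return "positive"
    return "strong positive"


# Hypothetical NAUC values for one strain, with a TSB reference of 10.0
labels = [classify_growth(value, 10.0) for value in (0.0, 0.4, 0.5, 3.0, 7.0)]
```

The boundary values fall into the upper category, as stated in the text (for example, exactly 5% is weak positive and exactly 70% is strong positive).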

2.3.2. Biomass production assay

2.3.2.1. Experimental methodology

Strains were incubated at 28 °C for 120 h in four general-purpose media commonly used in microbiology: NB; TSB; LB Broth (Miller, Sigma-Aldrich, USA); and Brain-Heart Infusion Broth (BHI; BIOKAR Diagnostics, France), in order to determine the best culture medium for biomass production for each strain. The assay was performed in 96-well microplates, with 270 µl of liquid media. Wells were inoculated in duplicate, with 10 µl of inoculum each, prepared as described in section 2.2.1. Wells with medium, but without inoculum, were incubated in each plate, as blank controls. The OD at 595 nm was measured at 24 h intervals in a microplate reader (Zenyth 3100, Anthos). Inocula viability and purity were assessed by streaking TSA plates, incubated at 28 °C, for 18-24 h.

2.3.2.2. Data analysis

Bacterial growth was assessed by NAUC analysis, as described in section 2.2.2. For each strain, the medium with the highest NAUC was considered as the most appropriate medium for biomass production.

2.3.3. Influence of environmental factors in strain growth

2.3.3.1. Experimental methodology

The effect of temperature, salinity, and pH on bacterial growth was studied for each strain, following the same methodology as described for the biomass production assay, in section 2.3.2 (incubation in 96-well microplates, duplicates, blank controls, OD measurements at 24 h intervals, and inocula viability and purity assessment). In the temperature assay, strains were incubated in TSB, at room temperature (circa 20 °C), 28 °C and 37 °C, for 72 h. In the salinity assay, strains were incubated in a nutritive broth (1% w/v of tryptone, and 0.5% w/v of meat extract, pH 7.0 ± 0.2), and in nutritive broth supplemented with 1, 3, 5, 7, 9, or 10% w/v of sodium chloride, at 28 °C, for 120 h. In the pH assay, strains were incubated in TSB adjusted from pH 3 to 11 (in increments of 2 pH units) with a 10 M solution of hydrochloric acid or a 10 M solution of sodium hydroxide, at 28 °C, for 120 h.

2.3.3.2. Data analysis

Bacterial growth for all assays was assessed by NAUC analysis, as described in section 2.2.2. The most suitable temperature, salt concentration, and pH for each strain (out of the test conditions) were determined from the NAUCs at the end of each assay (72 h for temperature; 120 h for salinity and pH). The NAUC for 120 h in TSB at 28 °C from the biomass assay was used as the reference for strain growth in the salinity and pH assays. For these assays, growth was considered negligible if the NAUC for 120 h of incubation, for a given salt concentration or pH level, was below 5% of the reference NAUC.

Results from the phenotypic characterization (sole carbon source utilization, biomass production, and influence of environmental factors in bacterial growth) were integrated and subjected to a PCA, and a dendrogram was constructed, following the methodology described in section 2.2.2, to group strains into clusters with similar physiological response.

2.4. Screening of hydrolytic enzymes

Enzyme prospection was carried out by function-based screening, using three distinct methodologies: 1) search for growth in a medium with a compound that acts as the sole carbon source and is simultaneously a substrate for a given enzyme; 2) search for the formation of halos or zones of hydrolysis, where the substrate is digested; and 3) search for alterations in the medium (e.g. a color change due to a change in pH, or to the production of a colored compound).

Disaccharide and ester utilization by bacteria as sole carbon source requires the production of hydrolases (glycosidases and carboxylesterases, respectively). This allows the results from the sole carbon source assay with these types of substrate to be used for the prospection of hydrolase producers. In this work, strains able to grow on cellobiose are, necessarily, producing β-glucosidases; strains able to grow on lactose are producing β-galactosidases; and strains able to grow on esters (methyl butyrate, Tweens 20, 40, 60, and 80) are producing carboxylesterases (non-specific esterases, and/or lipases). Hydrolase production was considered negligible when the NAUC at 36 h for these substrates was below 5% of the NAUC for TSB at 36 h, positive when equal to or above 5% and below 70%, and strong positive when equal to or above 70%.
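The classification rule above can be written as a short function. This is only an illustrative sketch: the function name and the example NAUC values are hypothetical, and the NAUCs are assumed to have been computed beforehand as described in section 2.2.1.

```python
def classify_hydrolase(nauc_substrate: float, nauc_tsb: float) -> str:
    """Classify hydrolase production from NAUCs at 36 h.

    nauc_substrate: NAUC with the test substrate as sole carbon source.
    nauc_tsb: reference NAUC in TSB (must be positive).
    """
    if nauc_tsb <= 0:
        raise ValueError("reference NAUC must be positive")
    ratio = nauc_substrate / nauc_tsb
    if ratio < 0.05:       # below 5% of the reference
        return "negligible"
    if ratio < 0.70:       # from 5% up to, but excluding, 70%
        return "positive"
    return "strong positive"  # 70% or more


# Hypothetical NAUC values, for illustration only.
for substrate, nauc in [("cellobiose", 0.3), ("lactose", 2.1), ("Tween 20", 8.4)]:
    print(substrate, classify_hydrolase(nauc, nauc_tsb=10.0))
```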

Except for the tests in the sole carbon source assay referred to above, and unless stated otherwise, hydrolase prospection was performed in 24-well microplates, with a total of 1.5 ml of solid medium. Plates were inoculated in duplicate, with 10 µl of inoculum (prepared as described in section 2.2.1), and incubated at 28 °C, for 48 h. Wells with medium, but without inoculum, were incubated in each plate, as blank controls. Inocula viability and purity were assessed by streaking TSA plates, incubated at 28 °C, for 18-24 h. Tests with soluble starch, carboxymethyl cellulose (CMC), dextran, xylan, and pectin were performed using a double layer technique, which allows better visualization of degradation zones. In this technique, only the overlaying layer includes, in its composition, the substrate for the enzyme being searched.

Enzyme activity was determined as described next.

2.4.1. Carboxylesterase activity

Strains were incubated in Tributyrin Agar Base (Merck, USA), supplemented with 1% v/v of tributyrin. The tributyrin forms a suspension in the agar, turning the medium opaque. Production of enzymes with carboxylesterase activity leads to the appearance of a clear halo around the bacterial growth [117,253,254]. Results were divided into three categories: negative, when a halo was not detected; positive, when a small halo was visible around the bacterial growth; and strong positive, when the halo covered more than half the surface of the inoculated well.

2.4.2. DNase activity

Strains were incubated in DNase Test Agar (Merck, USA) supplemented with 0.005% w/v of methyl green. This cationic stain produces a green complex with DNA, giving the medium a greenish color. DNase production leads to the appearance of a clear halo (amber colored) around the bacterial growth [255,256]. Results were divided into three categories: negative, when a halo was not detected; positive, when a small halo was visible around the bacterial growth; and strong positive, when the totality of the surface of the well had lost the greenish color.

2.4.3. Agarase activity

Strains were incubated in pH-adjusted YNB, supplemented with 1% w/v of bacteriological agar (YNBA), which acted as sole carbon source. The appearance of a shallow depression in the agar surface around the bacterial growth is considered a positive result for agarase production [240,257].

In strains that do not show agarase production, assays conducted with YNBA supplemented with a given enzyme substrate make it possible to test simultaneously the ability of the strain to produce the enzyme required to hydrolyze the substrate, and its ability to grow with the products of that hydrolysis as sole carbon sources.

2.4.4. Dextranase activity

Strains were incubated on TSA and YNBA, both supplemented with 1% w/v of dextran. After incubation, wells were flooded with Gram’s iodine for 15 min. Gram’s iodine turns the medium dark grey in the presence of dextran. The production of dextranases leads to the appearance of a clear halo around the bacterial growth [258,259]. Results were divided into three categories: negative, when all the medium was colored dark grey; positive, when a clear halo appeared around the bacterial growth; and strong positive, when the clear halo covered more than half the surface of the inoculated well.

2.4.5. Cellulase activity

The prospection for cellulose degrading enzymes was performed in solid medium, using CMC as substrate, and in liquid medium, with chromatography paper (made of cotton cellulose). In the CMC test, strains were incubated on TSA and YNBA, both supplemented with 1% w/v of CMC. Following incubation, wells were flooded with a 1 mg/ml solution of Congo red. After 30 min the plates were rinsed and flooded with 1 M of sodium chloride for another 30 min. Congo red forms a red complex with polysaccharides. CMCase (enzyme complex that hydrolyzes CMC) production leads to the appearance of an amber halo around the bacterial growth [260–262]. Results were divided into three categories: negative, when the medium was colored red; positive, when an amber halo appeared around the bacterial growth; and strong positive, when the amber halo covered more than half the surface of the inoculated well. In the paper test, strains were incubated in pH-adjusted YNB and in TSB with submerged sterile discs (one for each well) of Whatman 3MM chr paper (Ø 0.5 cm). The plates were observed for paper solubilization, for up to 30 days [146,263].

2.4.6. β-glucosidase activity

Strains were incubated on TSA supplemented with 1% w/v of ammonium iron(III) citrate, and 0.5% w/v of esculin. Hydrolysis of esculin by β-glucosidases produces glucose and esculetin, a compound that, when combined with the iron citrate, produces a dark brown complex, promoting a change in the color of the medium, from light amber to brown/black [240,264–266]. Results were divided into three categories: negative, when the medium retained the original amber color; positive, when a dark brown color appeared around the bacterial growth; and strong positive, when the totality of the surface of the well acquired a dark brown/black color.

2.4.7. Xylanase activity

Strains were incubated on TSA and YNBA, both supplemented with 1% w/v of xylan. Following incubation, wells were flooded with a 1 mg/ml solution of Congo red. After 30 min the plates were rinsed and flooded with 1 M of sodium chloride for another 30 min. Xylanase production leads to the appearance of a yellow halo around the bacterial growth [267,268]. Results were divided into three categories: negative, when all the medium was colored red; positive, when a yellow halo appeared around the bacterial growth; and strong positive, when the yellow halo covered more than half the surface of the inoculated well.

2.4.8. Pectinase activity

Strains were incubated on TSA and YNBA, both supplemented with 1% w/v of pectin. Following incubation, wells were flooded with 0.1% w/v of ruthenium red and left to settle for 5 min. Ruthenium red binds to pectin, turning the medium dark purple/red. Pectinase production leads to the appearance of a light purple/red halo around the bacterial growth [49,269,270]. Results were divided into three categories: negative, when all the medium was colored dark purple/red; positive, when a light purple/red halo appeared around the bacterial growth; and strong positive, when the light purple/red halo covered more than half the surface of the inoculated well.

2.4.9. Amylase activity

Strains were incubated on TSA and YNBA, both supplemented with 1% w/v of soluble starch. After incubation, wells were flooded with Gram’s iodine for 15 min. Gram’s iodine turns the medium purple-blue in the presence of starch. The production of amylases leads to the appearance of a clear halo around the bacterial growth [150,240,271]. Results were divided into three categories: negative, when all the medium was colored purple-blue; positive, when a clear halo appeared around the bacterial growth; and strong positive, when the clear halo covered more than half the surface of the inoculated well.

2.4.10. Peptidase activity

In the peptidase activity prospection two substrates were used: casein, a family of proteins that accounts for 80% of the protein in cows’ milk [76]; and gelatin¹⁷, a protein derived from collagen, present in the skin, bones, tendons, and in the loose connective tissue of animals [272]. For the screening of casein degrading hydrolases, strains were incubated in Skim-milk Agar (5% w/v of skim-milk powder, 1% w/v of bacteriological agar). The casein in the milk powder forms an opaque suspension in the milk agar. The production of peptidases leads to the appearance of a clear halo around the bacterial growth [240,273]. Results were divided into two categories: negative, when all the medium remained opaque; and positive, when a clear halo appeared around the bacterial growth. For the screening of gelatin degrading hydrolases, strains were incubated in Nutrient Gelatin (0.5% w/v of peptone, 0.3% w/v of meat extract, 12% w/v of gelatin powder). After incubation, plates were refrigerated for 2 h, so that any remaining gelatin could solidify [240]. This assay does not discriminate gelatinase A (EC 3.4.24.24), or gelatinase B (EC 3.4.24.35), from other peptidases able to hydrolyze gelatin. Results were divided into two categories: negative, when the medium returned to solid after refrigeration; and positive, when the medium remained liquid.

2.4.11. Urease activity

Strains were incubated in Urea Agar (Base) acc. Christensen (Merck, USA), supplemented with 2% w/v of urea. Urease activity leads to the alkalization of the medium, due to the release of ammonia from the hydrolysis of the urea, which reacts with the water molecules in the medium to produce ammonium cations, leading to a color change in the medium, from yellow to bright pink [240,241,274]. Results were observed at 18 h of incubation, as prolonged incubation leads to false positives due to alkalization of the medium from peptone decomposition. Results were divided into three categories: negative, when the medium remained yellow; slow positive, when the medium showed an intermediate color between yellow and pink, or a pink halo was observed around the bacterial growth; and fast positive, when all the medium turned bright pink.

17 According to the search engine in the enzyme database BRENDA (https://www.brenda-enzymes.org/) there are more than 140 described peptidases that hydrolyze casein (from classes EC 3.4.11.-, EC 3.4.21.-, EC 3.4.22.-, EC 3.4.23.-, EC 3.4.24.-, and EC 3.4.25.-), and more than 100 described peptidases that are able to use gelatin as substrate (from classes EC 3.4.21.-, EC 3.4.22.-, EC 3.4.23.-, and EC 3.4.24.-).

2.5. Strain identification

2.5.1. Total genomic DNA extraction

Total DNA was extracted employing a modified version of the guanidium thiocyanate method [275]. Strains for DNA extraction were grown overnight on TSA at 28 °C. In a 2 ml microtube, cells were suspended in 250 μl of lysis buffer [50 mM tris(hydroxymethyl)aminomethane (Tris), 250 mM sodium chloride, 50 mM ethylenediaminetetraacetic acid (EDTA), 0.3% sodium dodecyl sulfate (SDS); pH 8]. An approximate volume of 100 μl of glass microbeads (Ø 425-600 μm) was added and mixed with a vortex (full speed, 2 min). The suspension was incubated in a dry bath (65 °C, 30 min), and mixed again in a vortex. Next, 250 μl of GES reagent (5 M guanidium thiocyanate, 100 mM EDTA, and 0.5% v/v sodium lauroyl sarcosinate, pH 8) were added and mixed by inversion, after which the suspension was incubated on ice for 10 min, in order to allow cell lysis to occur. After cell lysis was confirmed by clarification of the solution, 250 μl of cold 10 M ammonium acetate was added to the suspension, which was incubated on ice for another 10 min. After incubation, 1 ml of a mixture (24:1) of chloroform and isoamyl alcohol was added and mixed vigorously by inversion. Following a centrifugation step (10 000 rpm, 10 min), the supernatant was transferred to a new 2 ml microtube. An equivalent volume of ice-cold isopropanol was added and mixed by inversion. The suspension was again centrifuged (10 000 rpm, 10 min), and the supernatant discarded. The resulting pellet was washed with 1 ml of 70% v/v ethanol. After a final centrifugation step (10 000 rpm, 10 min), the supernatant was discarded, and the microtube was left to air dry at room temperature for at least 1 h. The resulting pellet was solubilized in 100 μl of TE buffer (10 mM Tris and 1 mM EDTA, pH 8.0), and stored at -20 °C until use.

DNA quality was assessed by horizontal gel electrophoresis. A sample of 5 μl from the extract was mixed with 1 μl of loading dye (0.25% w/v of bromophenol blue, 30% v/v of glycerol), resolved in a 0.9% w/v agarose gel in 0.5x TBE buffer (50 mM Tris, 45 mM boric acid, 0.5 mM EDTA), at 5 V/cm for 60 min, stained in a 0.5 µg/ml solution of ethidium bromide (EtBr) [242,276], and observed under UV light (254 nm). All reagents were bought from Invitrogen, UK.

2.5.2. Partial amplification and sequencing of 16S rRNA gene

Molecular identification was based on partial 16S rRNA gene sequence analysis, which enables strain identification at least at the taxonomic rank of genus.

For each strain from the working set, the 16S rRNA gene was partially amplified using the forward primer PA/8F (5’-AGAGTTTGATCCTGGCTCAG-3’) [233,234,277,278] and the degenerate reverse primer 1392R (5’-ACGGGCGGTGTGTRC-3’, where R corresponds to A or G) [279–281]. This set of primers allows the amplification of hypervariable regions V1 to V8, and a small portion of region V9 [282]. The PCR mixture contained: PCR buffer 1x; 2 mM of magnesium chloride; 1 μM of primer PA; 1 μM of primer 1392R; 200 μM of each deoxyribonucleotide triphosphate (dNTP); 1 U/50 µl of Taq polymerase (all reagents Invitrogen, UK); and 1 µl of the solution with the DNA template. PCR cycling conditions consisted of an initial denaturation step of 3 min at 94 °C, followed by 35 cycles of 1 min at 94 °C (denaturation), 1 min at 55 °C (annealing), and 1 min at 72 °C (extension), and an additional final step of 3 min at 72 °C, for chain elongation. All reactions were performed using a Biometra Uno II thermocycler, in a final volume of 50 µl, made up with ultrapure DNase/RNase-free distilled water.
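As an illustration of what the degeneracy in primer 1392R means in practice, the sketch below expands the degenerate sequence into the explicit oligonucleotides it represents. It relies on the IUPAC nucleotide code, in which R denotes a purine (A or G); the function name and the reduced code table are illustrative choices, not part of the original protocol.

```python
from itertools import product

# Subset of the IUPAC nucleotide ambiguity codes (R = purine, Y = pyrimidine).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
}


def expand_degenerate(primer: str) -> list[str]:
    """Return every explicit sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


variants = expand_degenerate("ACGGGCGGTGTGTRC")  # primer 1392R, 5' to 3'
print(variants)  # two oligonucleotides, differing only at the R position
```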

PCR products were mixed with loading dye and resolved in a 1% w/v agarose gel in 0.5x TBE buffer, at 5 V/cm for 60 min. A mixture of 7% v/v of 1 kb plus DNA Ladder (Invitrogen, UK), 20% w/v of bromophenol blue, and 73% v/v of TE buffer, at pH 8, was used as molecular size marker. DNA bands were visualized under UV light (254 nm), after staining with EtBr, and digitized in an Alliance 4.7 UV Transilluminator (UVITEC Cambridge), using Alliance software version 15.15 (UVITEC Cambridge).

DNA concentrations of PCR products were estimated with NIH ImageJ¹⁸ [283,284], by comparing the relative intensity of the target DNA bands with that of a band of known DNA concentration in the molecular size marker.

Samples were commercially purified, and the reverse strand was sequenced by Sanger sequencing (SupremeRun from GATC Biotech, Eurofins Genomics), using primer 1392R.

2.5.3. Sequence analysis and phylogenetic reconstruction

Sequencing results were classified with the Naive Bayesian Classifier of the Ribosomal Database Project (RDP)¹⁹ [285], which provides taxonomic assignments down to the rank of genus, and subjected to a similarity-based search against a quality-controlled database of 16S rRNA gene sequences, EZBioCloud²⁰ [286]. Sequences were also compared with BLASTn²¹ (nucleotide-nucleotide basic local alignment search tool) [287] against the 16S ribosomal RNA sequences (Bacteria and Archaea) and the Nucleotide collection (nr/nt) databases, hosted at the National Center for Biotechnology Information (NCBI) server, to determine the closest known relatives.

Following sequence comparison, phylogenetic reconstructions were produced for each strain, or group of close strains. The phylogenetic analysis used: the partial 16S rRNA gene sequences of each strain; sequences of the type strains of the species identified as the closest relatives; and, if applicable, close strains not yet assigned to a valid species. All sequences were retrieved from the nucleotide database hosted at NCBI. Multiple sequence alignment was achieved using CLUSTAL W [288] embedded into MEGA X²² [289]. Phylogeny reconstruction was achieved using the neighbor-joining algorithm [290], an agglomerative (bottom-up) clustering method that estimates the minimum evolution [291,292] by minimizing the length of the internal branches of the phylogenetic tree [293], with bootstrapping [294] of 1000 replicates, and using Jukes-Cantor [295] as substitution model (assuming equal base frequency and equal mutation rate). All sequence alignments and phylogenetic reconstructions were performed using MEGA X.
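To make the substitution model concrete, the sketch below computes the Jukes-Cantor distance, d = -(3/4) ln(1 - 4p/3), from the proportion p of differing sites between two aligned sequences. It is not the MEGA X implementation: the function names, the pairwise handling of gaps, and the toy sequences are assumptions made for illustration.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


def jukes_cantor_distance(p: float) -> float:
    """Expected substitutions per site under the Jukes-Cantor model."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy aligned fragments, for illustration only.
p = p_distance("ACGTACGTAC", "ACGTTCGTAC")
print(p, jukes_cantor_distance(p))
```

The correction matters little for closely related 16S rRNA gene sequences, where p is small, and grows as sequences diverge, because multiple substitutions at the same site become more likely.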

Finally, the sequence of each strain was compared against the sequences of the type strains of the closest relatives (species and subspecies) in the phylogenetic trees, using the BLASTn suite [296], to determine the percentage of identity in pairwise alignment.

18 ImageJ is available for download at https://imagej.nih.gov/ij/.
19 RDP Classifier is hosted at https://rdp.cme.msu.edu/classifier/classifier.jsp.
20 EZBioCloud “Identify” Service is hosted at https://www.ezbiocloud.net/identify.
21 BLASTn is hosted at https://blast.ncbi.nlm.nih.gov/.
22 MEGA X software is available for download at https://www.megasoftware.net/.

2.6. Case study – urea and ammonium removal for wastewater treatment

Strains with fast urease results were studied for their potential application in the removal of urea and ammonium from wastewaters, using two methods: (1) a direct method, by assessing the bacterial growth over time; and (2) an indirect method, by analyzing the respiration rate over time.

2.6.1. Preliminary test

Strains with fast urease results were subjected to a preliminary test to assess their ability to grow on synthetic wastewater with ammonium chloride or urea as sole nitrogen source. For the preliminary test, strains were incubated on TSA plates, at 28 °C, for 18-24 h. After incubation, inocula were produced from a 10 µl-loop of each culture suspended in 500 µl of sterile saline solution (0.8% w/v of sodium chloride). The test was performed in duplicate, in a 96-well microplate with 10 µl of bacterial suspension inoculated in 270 µl of each synthetic wastewater. The plate was incubated at 28 °C, for 72 h, and observed every 24 h for turbidity. Strains were considered able to grow in a synthetic wastewater if it became turbid within 72 h. Inocula viability and purity were assessed by streaking TSA plates, incubated at 28 °C, for 18-24 h.

The synthetic wastewaters used in this study were based on the M9 medium from Elbing & Brent [297], made of 1% w/v of glucose, 0.6% w/v of disodium hydrogen phosphate, 0.3% w/v of potassium dihydrogen phosphate, 0.05% w/v of sodium chloride, 0.025% w/v of magnesium sulphate heptahydrate, and 0.1% w/v of ammonium chloride or 0.056% w/v of urea, as the sole nitrogen source. Both synthetic wastewaters contain an equivalent amount of nitrogen. The synthetic wastewaters were prepared with a higher amount of carbon source (glucose) than the original M9 formulation, with autoclaved tap water, and sterilized by filtration (using a 0.22 µm pore filter). The increased amount of glucose prevents the depletion of the carbon source in the medium during the test. Tap water was used as a source of calcium (optional in the original formulation), and of trace elements, like nickel, indispensable for urease activity (urease is a nickel-dependent metalloenzyme, as discussed in section 1.4.5). Autoclaved tap water maintains the naturally occurring elements, but loses the volatile chlorine used in the disinfection of the tap water, which might hinder bacterial growth.

Results were divided into three categories: negative, when the synthetic wastewater remained clear; positive, when the synthetic wastewater became slightly turbid; and strong positive, when the synthetic wastewater became completely turbid. Strains with high turbidity in the synthetic wastewater with urea as nitrogen source, and also able to grow in the synthetic wastewater with ammonium chloride, were selected for further assays.
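The nitrogen equivalence of the two synthetic wastewaters can be checked with a short calculation. The sketch below uses standard atomic masses and the formulas NH4Cl and CH4N2O (urea); the helper names are illustrative.

```python
# Standard atomic masses (g/mol), rounded.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "Cl": 35.45}

AMMONIUM_CHLORIDE = {"N": 1, "H": 4, "Cl": 1}   # NH4Cl
UREA = {"C": 1, "H": 4, "N": 2, "O": 1}          # CO(NH2)2


def molar_mass(formula: dict[str, int]) -> float:
    """Molar mass (g/mol) of a compound given as element counts."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def nitrogen_g_per_l(percent_wv: float, formula: dict[str, int]) -> float:
    """Grams of nitrogen per litre supplied by a compound at a given % w/v."""
    grams_per_l = percent_wv * 10.0  # 1% w/v = 10 g/l
    n_fraction = formula.get("N", 0) * ATOMIC_MASS["N"] / molar_mass(formula)
    return grams_per_l * n_fraction


n_ammonium = nitrogen_g_per_l(0.1, AMMONIUM_CHLORIDE)
n_urea = nitrogen_g_per_l(0.056, UREA)
print(f"NH4Cl: {n_ammonium:.3f} g N/l, urea: {n_urea:.3f} g N/l")
```

Both media supply roughly 0.26 g of nitrogen per litre, so differences in growth between them should reflect the nitrogen source rather than the nitrogen dose.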

2.6.2. Growth assay – urea versus ammonium chloride as sole nitrogen source

2.6.2.1. Experimental methodology

The selected strains were inoculated on TSA plates in order to form a bacterial lawn, and incubated at 28 °C, for 18-24 h. After incubation, bacterial growth of each strain was recovered from the agar surface with a cell scraper and resuspended in 10 ml of sterile saline solution (0.8% w/v of sodium chloride). These suspensions were used as inoculum for the growth assay. A consortium suspension was obtained by mixing 2 ml of each individual strain suspension. For the assay, 50 ml of synthetic wastewater was inoculated with 500 µl of each inoculum. From the final volume of inoculated synthetic wastewater, 1 ml was aseptically collected for viable cell count at the beginning of the experiment; the remaining volume was incubated aerobically in an orbital shaker incubator (Cassel, France), at 28 °C, 150 rpm, for 6 days. The growth assay was performed in triplicate, with the two synthetic wastewaters used in the preliminary test (ammonium chloride or urea as sole nitrogen source), in 100 ml glass media bottles. Screw caps were loosely fitted in order to allow gas exchange and covered with aluminum foil to prevent environmental contamination. Viable cell counts were performed at days 1, 2, 4, and 6, from 1 ml samples, collected aseptically. Inocula viability and purity were assessed by streaking TSA plates, incubated at 28 °C, for 18-24 h.

Viable cell counts were assessed using the most probable number (MPN) technique [298]. The MPN technique allows the counting of viable cells or, more accurately, colony forming units (cfu), in contrast to simple OD readings, in which it is not possible to separate viable cells from dying, or already dead cells (if cell lysis did not occur). The test is based on naked eye observation of positive/negative growth of serial dilutions. The three-tube MPN method from serial dilutions was adapted from [299,300], and performed in 96-well microplates. Serial 10-fold dilutions (from 10⁻¹ to 10⁻⁸) were performed with sterile saline solution (0.8% w/v of sodium chloride), in a final volume of 1 ml. From these, 10 µl of each dilution were inoculated in 200 µl of TSB, three wells per dilution. The microplates were incubated at 28 °C and observed after 18-24 h for medium turbidity due to bacterial growth. Wells were considered positive if the medium became turbid, and negative if the medium remained clear. The number of positive wells was determined for each dilution. Wells with non-inoculated medium were used as blank controls in each microplate.

In order to assess whether the cultures remained axenic (in the bottles with individual strains), or whether all the strains in the consortia remained viable (which was possible because the strains forming the consortia had distinct colony morphologies), a 10 µl-loop of each sample collected for viable cell count was streaked on TSA plates, incubated at 28 °C, and observed after 18-24 h of incubation. This procedure was performed for every MPN determination.

2.6.2.2. Data analysis

The MPN was determined from the corrected MPN tables, published by de Man in 1983 [301]. For the determination of the combination of positives in three consecutive dilutions, ranging from 0-0-0 to 3-3-3, two criteria were used: (1) criterion of the positive wells, where the lowest dilution with all positive wells (or the dilution with the highest number of positive wells) and the consecutive two dilutions are used; and (2) criterion of the negative wells, where the dilution with all negative wells (or the dilution with the highest number of negative wells) and the previous two dilutions are used. If the application of the two criteria resulted in three consecutive dilutions, the MPN index was given from the MPN table in [301]. The MPN index was then multiplied by the dilution factor of the less diluted well series of the combination of positives to give an estimate of the number of viable bacteria in suspension. If the application of the two criteria resulted in four consecutive dilutions, the MPN was computed for two sets of three dilutions (the first three and the last three dilutions); the average of the logarithms of the two MPNs was computed, and the antilogarithm was rounded to two significant figures. This number was then used as a gross estimate of the MPN. If the application of the criteria resulted in five or more consecutive dilutions, the MPN for the replicate at that incubation time was not computed, and it was considered undetermined. For each strain, at each measuring point, and for each synthetic wastewater (with urea or ammonium chloride as sole nitrogen source), the mean MPN was computed from the MPN of the triplicates.
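For readers without access to the tables, the logic behind an MPN index can be sketched numerically. The code below is not the table lookup used in this work: it solves the standard maximum-likelihood MPN equation by bisection, which should give values close to the tabulated indices, and it adds the log-average used when four consecutive dilutions were obtained. Function names and the example well counts are hypothetical.

```python
import math


def mpn_estimate(positives, wells, volumes):
    """Maximum-likelihood MPN per unit volume of the undiluted sample.

    positives[i]: positive wells at dilution i; wells[i]: wells inoculated;
    volumes[i]: volume of ORIGINAL sample per well (inoculum volume x dilution).
    """
    if not (len(positives) == len(wells) == len(volumes)):
        raise ValueError("inputs must have the same length")
    if sum(positives) == 0:
        return 0.0
    if all(p == n for p, n in zip(positives, wells)):
        return math.inf  # all wells positive: only a lower bound exists
    total = sum(n * v for n, v in zip(wells, volumes))

    def score(lam):
        # Decreasing in lam; its root is the maximum-likelihood estimate.
        return sum(p * v / -math.expm1(-lam * v)
                   for p, v in zip(positives, volumes) if p) - total

    lo, hi = 1e-12, 1.0
    while score(hi) > 0:
        hi *= 10.0
    for _ in range(200):  # bisection on a logarithmic scale
        mid = math.sqrt(lo * hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def combine_two_mpn(mpn_a, mpn_b):
    """Antilog of the mean log10 of two MPNs, rounded to 2 significant figures."""
    mean_log = (math.log10(mpn_a) + math.log10(mpn_b)) / 2.0
    return float(f"{10.0 ** mean_log:.2g}")


# Hypothetical result 3-1-0 at dilutions 1e-5, 1e-6, 1e-7 with 10 µl (0.01 ml) per well.
volumes = [0.01 * d for d in (1e-5, 1e-6, 1e-7)]
print(f"{mpn_estimate([3, 1, 0], [3, 3, 3], volumes):.2g} cells/ml")
```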

2.6.3. Respiration assay – urea versus ammonium chloride as sole nitrogen source

2.6.3.1. Experimental methodology

The respiration assay was carried out in a self-check measurement respirometric (manometric) system (OxiTop, WTW), developed for BOD testing and measurement. The OxiTop works as a closed system, in which the amount of oxygen available for respiration is limited to that present at the beginning of the experiment. During respiration, bacteria consume oxygen and release carbon dioxide, which is absorbed inside the 510 ml amber BOD glass bottles by sodium hydroxide pellets. This process results in an increasing negative pressure that the system computes as mg/l BOD²³.

Inocula for individual strains and for consortia were prepared, and tested, as described for the growth assay (see section 2.6.2). A volume of 75 ml of synthetic wastewater was inoculated with 1.5 ml of bacterial suspension. The inoculated synthetic wastewater was aseptically transferred to three BOD bottles (22.7 ml each). From the remaining medium, 1 ml was collected to perform an MPN test to assess the initial bacterial load. Respiration tests were performed in triplicate, at room temperature (circa 20 °C), with constant stirring, and with pressure readings at 60 min intervals, over a period of 96 h, for both synthetic wastewaters. Tests for each strain (or consortium) with ammonium chloride as sole nitrogen source were performed in parallel (at the same time) with the tests with urea. At the end of the assay, MPN tests were performed for each BOD bottle to assess the final bacterial load (viable cells).

2.6.3.2. Data analysis

In this test it was assumed that strains with higher/faster growth show a faster oxygen consumption. The difference in oxygen consumed over time was used to infer the difference in growth between individual strains, between the consortium and individual strains, and between synthetic wastewaters. The AUCs at 24, 48, 72, and 96 h for each BOD bottle were used to quantify the evolution of the respiration over time, using first-order numerical integration. Mean AUCs were computed for each strain, in each medium, at each incubation interval, from the AUCs of the respective triplicates. The mean AUCs were used to compare the respiration rates between the media with distinct nitrogen sources. The mean AUCs were also compared between strains (individual strains between each other, and individual strains versus the same strains in consortium), to determine the best candidate (individual strain or consortium) for urea and/or ammonium removal in aqueous solutions.
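A minimal sketch of the AUC computation follows, reading "first-order numerical integration" as the trapezoidal rule, which is an assumption. The function names and the BOD readings are hypothetical; real input would be the hourly mg/l BOD values logged for each bottle.

```python
def auc_trapezoid(times_h, values, until_h):
    """Area under the curve from the first reading up to until_h (trapezoidal rule)."""
    if len(times_h) != len(values):
        raise ValueError("times and values must have the same length")
    area = 0.0
    for i in range(1, len(times_h)):
        if times_h[i] > until_h:
            break
        width = times_h[i] - times_h[i - 1]
        area += 0.5 * (values[i] + values[i - 1]) * width
    return area


def mean_auc(replicates, times_h, until_h):
    """Mean AUC over replicate bottles (e.g. the triplicates of one strain)."""
    aucs = [auc_trapezoid(times_h, values, until_h) for values in replicates]
    return sum(aucs) / len(aucs)


# Hypothetical hourly BOD readings (mg/l) for three bottles over 96 h.
times = list(range(0, 97))
bottles = [[0.9 * t for t in times], [1.0 * t for t in times], [1.1 * t for t in times]]
for cutoff in (24, 48, 72, 96):
    print(cutoff, mean_auc(bottles, times, cutoff))
```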

2.6.4. Urea and ammonium chloride tolerance assays

Growth at increasing concentrations of urea and of ammonium chloride was studied for the strain with the best overall results from the growth and the respiration assays.

23 OxiTop Respirometer Systems, http://www.globalw.com/products/oxitop.html.

2.6.4.1. Experimental methodology

Inoculum was prepared, and tested, as described for the growth assay (see section 2.6.2). For urea tolerance, a volume of 500 µl of bacterial suspension was inoculated in 50 ml of unsupplemented TSB, and in 50 ml of TSB supplemented with 1.66%, 3.33%, 5.00%, 6.66%, 8.33%, and 10.00% w/v of urea. The assay was performed in triplicate, following the same methodology described for the growth assay (media incubated aerobically in an orbital shaker incubator, in loosely capped 100 ml glass media bottles, at 28 °C, 150 rpm, for 6 days, with viable bacterial counts performed at days 1, 2, 4, and 6). The same methodology was used for ammonium tolerance, using TSB supplemented with 3.00%, 6.00%, and 9.00% w/v of ammonium chloride (roughly corresponding, respectively, to the same amount of added nitrogen as in TSB supplemented with 1.66%, 3.33%, and 5.00% w/v of urea).

2.6.4.2. Data analysis

Data from the urea and ammonium chloride tolerance assays were analyzed using the same methodology as for the growth assay with urea as sole nitrogen source (see section 2.6.2).

2.7. Case study – cellulose degradation for wastewater treatment

Strains with strong positive results in the degradation of CMC were studied for their potential application in the degradation of cellulose in wastewaters by assessing the bacterial growth over time.

2.7.1. Experimental methodology

Cellulose degradation was assessed by bacterial growth comparison. The assay was based on the premise that bacteria, or bacterial consortia, which are capable of cellulose hydrolysis, when grown in the presence of cellulose, take up the products of cellulose hydrolysis (mainly glucose and cellobiose) and use them as carbon and energy sources. This uptake is then reflected as a higher number of viable cells in the suspension (bacterial growth).

For the cellulose degradation assay two substrates were used: a soluble cellulose, CMC; and an insoluble microcrystalline cellulose, Avicel PH-101. The assay was performed in 50 ml of 1/10 strength TSB (base medium)²⁴, supplemented with 1% w/v of substrate (CMC or Avicel). Inocula, for individual strains, and for the same strains in consortium, were prepared, and tested, as described for the growth assay comparing growth with urea or ammonium chloride as nitrogen source (see section 2.6.2). A volume of 500 µl of each bacterial suspension was inoculated in 50 ml of base medium, and in 50 ml of base medium supplemented with 1% w/v of substrate (CMC or Avicel); 1 ml of the inoculated medium was aseptically collected to assess the concentration of viable cells at the beginning of the test, using the MPN method. The assay was performed in triplicate, following a methodology similar to that described for the previous growth assays (media incubated aerobically in an orbital shaker incubator, in loosely capped 100 ml glass media bottles, at 28 °C, 150 rpm). The assay was performed for 9 days, with viable bacterial counts performed at days 1, 2, 3, 5, 7, and 9.

24 The 1/10 strength TSB was chosen as base medium to perform the assay due to the fact that none of the strains from the working set were able to hydrolyze CMC when this substrate was the only carbon source available (see section 3.3.3).

2.7.2. Data analysis

Data from the growth assays was analyzed using the same methodology as in the growth assay with urea as sole nitrogen source (see section 2.6.2).

2.8. Workflow of the study

A workflow of the work developed in this study is represented in Figure 1.

Figure 1. Workflow of the study.

3. Results and Discussion

1. Rapid growth screening and reduction of the number of strains
2. Phenotypic characterization of strains in the working set
3. Enzyme screening
4. Identification of the strains in the working set
5. Case study – Urea and ammonium removal for wastewater treatment
6. Case study – Cellulose degradation for wastewater treatment

3.1. Rapid growth screening, and reduction of the number of strains

The initial set of 45 bacterial strains (selected as representatives of 202 strains of the BBC original collection) included: 7 Gram-positive rods; 11 Gram-positive cocci; 25 Gram-negative rods; and 2 Gram-negative cocci. All strains were able to grow in the assay conditions, in both TSB and NB.

Reduction of the number of strains was attained from the analysis of the PCA results. For the PCA, data from the phenotypic characterization and the rapid growth screening was integrated into a matrix of 45 OTUs versus 28 variables. The variables considered were: cell morphology, Gram staining, oxidase, and catalase; and NAUCs for 4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, and 48 h of incubation, for both TSB and NB.

NAUC at 36 h of incubation in TSB showed very strong correlations (r > 0.979, p < 0.001) with all computed NAUCs from 24 to 48 h of incubation in TSB, but not with any of the computed NAUCs for NB; similarly, NAUC at 36 h in NB showed very strong correlations (r > 0.969, p < 0.001) with all computed NAUCs from 24 to 48 h of incubation in NB, but not with any NAUCs in TSB (see Appendix D for detailed results). These results suggest that, for the strains of the initial set, the NAUC at 36 h of a given strain in a given medium can be used as a good predictor of its growth in that medium for incubation periods between 24 and 48 h.
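As an illustration of the kind of comparison reported above, the following minimal sketch computes a Pearson correlation coefficient between two series of NAUC values. The function name `pearson_r` and the numbers are hypothetical placeholders, not data from this study, and the p-values reported in the text would require an additional significance test that is not shown here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical NAUCs for five strains at two incubation times (illustrative only).
nauc_24h = [12.0, 18.5, 25.0, 30.5, 41.0]
nauc_36h = [20.0, 27.5, 36.0, 42.5, 55.0]
r = pearson_r(nauc_24h, nauc_36h)
print(f"r = {r:.3f}")
```

A value of r close to 1 for every pair of incubation times within a medium is what would justify using a single time point, such as 36 h, as a summary of growth in that medium.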

The first two PCs collectively represented 77.7% of the total variance, with PC1 and PC2 amounting to 60.1% and 17.6%, respectively. Only the NAUCs for TSB and NB were found to be relevant explanatory variables. The two-dimensional plot of the OTUs in the PCA space defined by PC1 and PC2 did not show a clear separation between groups of strains. For this reason, a dendrogram was constructed to group strains into clusters. The cophenetic correlation coefficient for the dendrogram is 0.72 (two-way Mantel test). As this value is between 0.6 and 0.95 [249], the dendrogram can be used to reliably infer clusters.

Considering a cut-off value of 28% dissimilarity, the 45 OTUs were agglomerated into four clusters, one of which is a single-member cluster (see Figure 2). Cluster I encompasses three strains (BBC|003, BBC|008, and BBC|128) with high mean NAUC at 36 h for both TSB and NB, and a low coefficient of variation (CV)²⁵, an indication of a similar response in these two media during incubation (see Table 6). All three strains from cluster I were selected for the working set. Cluster II, with 33 elements, comprises strains with medium mean NAUC values and a low to medium dispersion (intermediate CV values). Cluster II aggregates more than 70% of all strains in the initial set.
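The idea of cutting a UPGMA dendrogram at a dissimilarity threshold can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the software used in this study: the function `upgma_clusters` is hypothetical, the toy dissimilarities are invented, and the sketch works directly on a dissimilarity matrix rather than on the PCA projection matrix. Because average-linkage merge heights never decrease, stopping the agglomeration at the first merge above the cut-off gives the same groups as cutting the finished dendrogram at that height.

```python
def upgma_clusters(dist, cutoff):
    """Group items by average linkage (UPGMA), stopping at a dissimilarity cut-off.

    dist is a symmetric matrix (list of lists) of pairwise dissimilarities.
    Returns a sorted list of clusters, each a sorted list of item indices.
    """
    clusters = [[i] for i in range(len(dist))]

    def average_distance(a, b):
        # UPGMA: mean of all pairwise dissimilarities between members of a and b.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair among the current clusters.
        d, ia, ib = min(
            (average_distance(clusters[ia], clusters[ib]), ia, ib)
            for ia in range(len(clusters))
            for ib in range(ia + 1, len(clusters))
        )
        if d > cutoff:
            break  # every remaining merge lies above the cut-off line
        merged = sorted(clusters[ia] + clusters[ib])
        clusters = [c for k, c in enumerate(clusters) if k not in (ia, ib)]
        clusters.append(merged)
    return sorted(clusters)


# Toy example: five hypothetical OTUs placed on a line (not data from this study).
positions = [0.00, 0.05, 0.40, 0.45, 1.00]
toy_dist = [[abs(p - q) for q in positions] for p in positions]
toy_clusters = upgma_clusters(toy_dist, cutoff=0.28)
print(toy_clusters)
```

In the toy example the last item stays alone, which is how a single-member cluster arises: its average dissimilarity to every other group exceeds the cut-off.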

25 The CV was used here because it gives a better perception of the dispersion of the population than the sample standard deviation alone: the higher the coefficient of variation, the higher the dispersion of the population, and the higher the variability in the population [254].

[Figure 2: dendrogram with a dissimilarity scale from 0.50 to 0.00; the data displayed alongside the dendrogram are tabulated below.]

Cluster  Strain   Cells   Gram  NAUC TSB  NAUC NB
I        BBC|003  rod     -     42.15     42.23
I        BBC|008  rod     -     43.57     42.78
I        BBC|128  coccus  +     52.01     49.72
II       BBC|005  rod     -     37.76     31.58
II       BBC|046  rod     -     33.51     31.05
II       BBC|161  rod     -     32.20     31.27
II       BBC|034  rod     -     29.15     26.97
II       BBC|021  rod     -     36.11     31.96
II       BBC|170  rod     -     34.84     32.05
II       BBC|023  rod     -     36.61     37.72
II       BBC|032  rod     -     32.97     34.49
II       BBC|028  rod     -     28.93     32.99
II       BBC|187  rod     -     25.67     32.99
II       BBC|031  rod     -     35.47     31.96
II       BBC|029  rod     -     30.88     24.99
II       BBC|188  rod     -     26.78     24.32
II       BBC|033  rod     +     30.55     29.96
II       BBC|035  rod     +     35.45     27.15
II       BBC|048  coccus  +     28.27     29.42
II       BBC|110  coccus  +     30.40     30.21
II       BBC|038  coccus  -     27.04     38.92
II       BBC|118  rod     -     31.26     38.78
II       BBC|186  rod     -     36.42     39.36
II       BBC|042  coccus  +     35.14     38.67
II       BBC|056  coccus  +     34.15     39.06
II       BBC|205  coccus  +     41.30     40.48
II       BBC|150  coccus  -     20.13     37.06
II       BBC|162  rod     +     25.85     38.93
II       BBC|189  rod     -     21.70     37.57
II       BBC|016  rod     -     41.78     28.01
II       BBC|020  rod     -     39.96     34.79
II       BBC|024  rod     -     43.48     38.80
II       BBC|027  rod     -     39.06     36.71
II       BBC|184  rod     -     38.94     30.61
II       BBC|077  coccus  +     51.00     38.13
II       BBC|177  coccus  +     47.67     33.13
III      BBC|148  coccus  +     49.31     26.57
IV       BBC|006  rod     -     15.41     14.20
IV       BBC|169  coccus  +     28.18     19.89
IV       BBC|044  coccus  +     16.22     29.81
IV       BBC|111  rod     +     18.25     25.52
IV       BBC|163  rod     +     19.06     30.12
IV       BBC|167  rod     +     19.70     31.80
IV       BBC|152  rod     -     18.85     26.68
IV       BBC|210  rod     +     12.32     32.80

Figure 2. Dendrogram displaying the relationship between strains in the initial set. The dendrogram was obtained from the integration of the results from the laboratory routine characterization, and the NAUCs from the rapid growth screening. The dendrogram was generated from the projection matrix obtained in the PCA, using UPGMA as the clustering method. The cophenetic correlation for the whole dendrogram is 0.72 (computed from a two-way Mantel test). The scale corresponds to the reduced Euclidean distance. The cut-off value was set at 0.28 of reduced Euclidean distance, producing 4 clusters (I to IV), of which one is a single-member cluster. The 25 strains that were selected for the working set are shown in bold; NAUC TSB, and NAUC NB are the NAUC values after 36 h of incubation in the corresponding media; +, Gram-positive; -, Gram-negative.

Table 6. NAUCs for 36 h incubation in TSB and NB for clusters I, II, and IV*.

Cluster     Medium  Max#   Min    Mean   STDEV  CV
Cluster I   TSB     52.01  42.15  45.91  5.33   12%
Cluster I   NB      49.72  42.23  44.91  4.17   9%
Cluster II  TSB     51.00  20.13  33.95  6.87   20%
Cluster II  NB      40.48  24.32  33.64  4.60   14%
Cluster IV  TSB     28.18  12.32  18.50  4.60   25%
Cluster IV  NB      32.80  14.20  26.35  6.41   24%

* Cluster III comprises the strain BBC|148, with a NAUC TSB of 49.31 and a NAUC NB of 26.57.
# Max, maximum; Min, minimum; STDEV, sample standard deviation; CV, coefficient of variation.
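The summary statistics in Table 6 can be recomputed from the per-strain NAUC values listed with Figure 2. The sketch below does this for cluster I only, using Python's standard `statistics` module; the helper name `summarize` is hypothetical, and the CV is taken as the sample standard deviation divided by the mean, expressed as a percentage.

```python
import statistics

# NAUC values at 36 h for the three cluster I strains, as listed with Figure 2.
CLUSTER_I = {
    "BBC|003": {"TSB": 42.15, "NB": 42.23},
    "BBC|008": {"TSB": 43.57, "NB": 42.78},
    "BBC|128": {"TSB": 52.01, "NB": 49.72},
}


def summarize(values):
    """Return max, min, mean, sample standard deviation and CV (%) of values."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # sample standard deviation (n - 1)
    return {
        "max": max(values),
        "min": min(values),
        "mean": mean,
        "stdev": stdev,
        "cv_pct": 100 * stdev / mean,
    }


summary = {
    medium: summarize([nauc[medium] for nauc in CLUSTER_I.values()])
    for medium in ("TSB", "NB")
}
for medium, stats in summary.items():
    print(medium, {name: round(value, 2) for name, value in stats.items()})
```

The same function can be applied to the strains of clusters II and IV to check the remaining rows of the table.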

Twenty strains were selected from cluster II: BBC|005, BBC|016, BBC|020, BBC|021, BBC|024, BBC|027, BBC|029, BBC|031, BBC|032, BBC|034, BBC|035, BBC|056, BBC|077, BBC|118, BBC|161, BBC|170, BBC|177, BBC|184, BBC|186, and BBC|205. Cluster III comprises the strain BBC|148, with a high NAUC at 36 h for TSB, and a medium to low NAUC for NB. As the criterion for selection of strains was the inclusion of representatives from all clusters, strain BBC|148 was selected for the working set. Cluster IV, with eight elements, incorporates strains with low to medium mean NAUC values for both TSB and NB, and with a high CV. From this cluster, only strain BBC|169 was selected for the working set. Unlike the plot of the strains in the PCA space, the plot of NAUCs at 36 h in TSB versus NB easily discriminates clusters of strains with distinct responses (see Figure 3).

Figure 3. Spatial dispersion of the strains in the initial set by NAUCs for 36 h in TSB and NB. Strains are grouped with different colors by the clusters (I to IV) obtained from the 0.28 cut-off in the dendrogram generated after the PCA. Cluster I includes strains with high NAUCs for both TSB and NB, whereas cluster IV aggregates strains with low to medium NAUCs for both media. Cluster II and cluster III incorporate strains with medium NAUCs in NB, and medium to high NAUCs in TSB. Dotted lines intersect the mean point obtained for NAUCs for 36 h of incubation for TSB and NB (32.34, 32.94).

Compared with NB, NAUCs at 36 h in TSB showed a similar overall mean (32.34 versus 32.94) but a higher CV (30% versus 20%). TSB is a richer medium than NB: its composition includes glucose, and dipotassium phosphate to prevent the acidification of the medium during incubation, which may be responsible for the difference in the NAUC results. For some strains this characteristic may be beneficial, as for BBC|148 and BBC|177; neutral, as for several strains from clusters I and II; or detrimental, as for BBC|044 and BBC|210, in cluster IV.

For the working set, a total of 25 strains were selected, corresponding to a reduction of about 44% in the number of strains relative to the initial set of 45. The set includes 1 Gram-positive rod, 7 Gram-positive cocci, and 17 Gram-negative rods.
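The composition of the working set can be tallied from the cell morphology and Gram reaction listed with Figure 2. The sketch below is only an illustrative check: it uses the strains named in the text as selected, and takes the single member of cluster III to be BBC|148, as listed in Figure 2. The variable names are hypothetical.

```python
from collections import Counter

# (cell morphology, Gram reaction) of the 25 selected strains, from Figure 2.
WORKING_SET = {
    "BBC|003": ("rod", "-"), "BBC|008": ("rod", "-"), "BBC|128": ("coccus", "+"),
    "BBC|005": ("rod", "-"), "BBC|016": ("rod", "-"), "BBC|020": ("rod", "-"),
    "BBC|021": ("rod", "-"), "BBC|024": ("rod", "-"), "BBC|027": ("rod", "-"),
    "BBC|029": ("rod", "-"), "BBC|031": ("rod", "-"), "BBC|032": ("rod", "-"),
    "BBC|034": ("rod", "-"), "BBC|035": ("rod", "+"), "BBC|056": ("coccus", "+"),
    "BBC|077": ("coccus", "+"), "BBC|118": ("rod", "-"), "BBC|161": ("rod", "-"),
    "BBC|170": ("rod", "-"), "BBC|177": ("coccus", "+"), "BBC|184": ("rod", "-"),
    "BBC|186": ("rod", "-"), "BBC|205": ("coccus", "+"),
    "BBC|148": ("coccus", "+"), "BBC|169": ("coccus", "+"),
}
INITIAL_SET_SIZE = 45

# Count strains per (morphology, Gram) combination.
composition = Counter(WORKING_SET.values())

# Percentage of strains removed when going from the initial set to the working set.
reduction_pct = 100 * (INITIAL_SET_SIZE - len(WORKING_SET)) / INITIAL_SET_SIZE

for (cells, gram), count in sorted(composition.items()):
    print(f"Gram{gram} {cells}: {count}")
print(f"reduction: {reduction_pct:.1f}%")
```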

3.2. Phenotypic characterization of the strains in the working set

3.2.1. Physiological relationship between strains

The physiological relationship between strains was assessed by PCA. For the PCA, data from all assays in the characterization of the strains in the working set was integrated into a matrix of 25 OTUs versus 42 variables. The variables considered were: NAUCs at 36 h for all substrates in the sole carbon source assay; NAUCs at 120 h for the biomass, the pH, and the salinity assays; and NAUCs at 72 h for the temperature assay. See Appendix E for detailed results.

The first three PCs explained 52.39% of the total variance in the original data (PC1 with 21.91%; PC2 with 17.28%; and PC3 with 13.20%). Since these three PCs are unable to explain at least 75% of the total variance, a reliable three-dimensional representation of the clustering cannot be obtained from the PCA. However, the dendrogram constructed from the projection matrix, which is not limited to the first three PCs, showed a cophenetic correlation of 0.80 (two-way Mantel test) between the dissimilarity matrix and the matrix of cophenetic values, allowing the grouping of strains.
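The variance criterion used here amounts to summing the percentages explained by the retained PCs and comparing the total with the 75% threshold. A minimal sketch follows, using the percentages reported in the text for the two PCAs; the function and variable names are hypothetical, and applying the same threshold to the first PCA is an assumption made only for comparison.

```python
THRESHOLD_PCT = 75.0  # minimum cumulative variance for a reliable low-dimensional plot


def cumulative_variance(explained_pct):
    """Total percentage of variance explained by the retained components."""
    return sum(explained_pct)


def meets_threshold(explained_pct, threshold=THRESHOLD_PCT):
    """True when the retained components explain at least the threshold."""
    return cumulative_variance(explained_pct) >= threshold


initial_set_pcs = [60.1, 17.6]          # PC1, PC2 (initial set of 45 strains)
working_set_pcs = [21.91, 17.28, 13.20]  # PC1, PC2, PC3 (working set of 25 strains)

for label, pcs in (("initial set", initial_set_pcs), ("working set", working_set_pcs)):
    total = cumulative_variance(pcs)
    print(f"{label}: {total:.2f}% explained, threshold met: {meets_threshold(pcs)}")
```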

With a cut-off value of 23% dissimilarity, the working set was divided into six clusters, two of which are single-member clusters. Cluster I comprises three strains; cluster II, nine strains; cluster III, seven strains; cluster IV, four strains; and clusters V and VI are single-member clusters (see Figure 4). BBC|035 appears isolated from all other strains in a single-member cluster at more than 30% distance; BBC|128 appears in a single-member cluster at 28% distance, but associated with the remaining 23 strains. The two closest strains, BBC|003 and BBC|008, are separated by less than 6% dissimilarity.

Figure 4. Dendrogram displaying the relationship between strains in the working set. The dendrogram was obtained from the integration of the results from the sole carbon source assay, biomass assay, and influence of environmental factors assay. The dendrogram was generated from the projection matrix obtained in the PCA, using UPGMA as the clustering method. The cophenetic correlation for the whole dendrogram is 0.80 (computed from a two-way Mantel test). The scale corresponds to the reduced Euclidean distance. The cut-off value was determined at 0.23 of reduced Euclidean distance, producing six clusters (I to VI), of which two are single-member clusters.

3.2.2. Sole carbon source utilization

No growth was observed for any of the strains incubated in the pH-adjusted YNB. Under the conditions of the assay, all strains in the working set require an additional carbon source to grow.

Only two strains (BBC|161 and BBC|170) were able to grow (at least 5% of the reference growth) with all 23 substrates (see Figure 5). On the other hand, only three substrates (glucose, mannose, and cellobiose) were used as sole carbon source by all strains (see Figure 6).

[Figure 5: two bar charts, one for BBC|161 and one for BBC|170. x-axis: Substrates; y-axis: Relative Growth (0% to 100%).]

Figure 5. Relative growth for BBC|161 and BBC|170 for all substrates in the sole carbon source assay. Growth in TSB (left column, in red) was used as reference growth for each strain. Substrates are divided by color according to compound type: green for monosaccharides, violet for disaccharides, orange for sugar alcohols, and blue for esters. TSB, Trypto-casein Soy Broth; Ara, arabinose; Gal, galactose; Glu, glucose; Man, mannose; Rib, ribose; Xyl, xylose; Fru, fructose; Sor, sorbose; Cello, cellobiose; Lac, lactose; AraL, arabitol; GalL, galactitol; GlyL, glycerol; InoL, myo-inositol; ManL, mannitol; RibL, ribitol; SorL, sorbitol; XylL, xylitol; MB, methyl butyrate; T20, Tween 20; T40, Tween 40; T60, Tween 60; T80, Tween 80.

[Figure 6: three bar charts. A, Glucose; B, Mannose; C, Cellobiose. x-axis: Strains in the Working Set; y-axis: Relative Growth (0% to 125%).]

Figure 6. Relative growth for all strains in the working set using glucose, mannose, or cellobiose as sole carbon source. Growth in TSB was used as reference growth for each strain. Several strains showed higher NAUCs using mannose or cellobiose as sole carbon source, than with the reference medium. Strains are grouped in clusters ordered from I to VI (each with a different color) according to the dendrogram generated after PCA, presented in Figure 4.

For every substrate except galactitol, xylitol, and methyl butyrate, at least one strain showed strong positive growth (equal to or greater than 70% of the reference growth). The highest relative growth with galactitol was observed for BBC|184, reaching 64% of the reference growth. With methyl butyrate, the highest value was observed for BBC|021, corresponding to 63% of relative growth. With xylitol, the highest value was 15%, for strain BBC|128. The highest relative growth overall was achieved by strain BBC|186 with cellobiose and with Tween 20, reaching 128% of the reference growth with both substrates. For 13 of the 23 tested substrates, at least one strain showed higher growth than with the reference medium (see Figure 7).

[Figure 7: bar chart titled "Number of Strains with Strong Positive Growth". x-axis: Substrates; y-axis: Number of Strains (0 to 12); series: Growth > 70% and Growth ≥ 100%.]

Figure 7. Number of strains with strong positive growth by substrate as sole carbon source. Growth in TSB was used as reference growth for each strain. In orange the number of strains with at least 70% of the reference growth for each substrate; in blue the number of strains for each substrate with growth higher than the reference growth. The data refers to the working set. Ara, arabinose; Gal, galactose; Glu, glucose; Man, mannose; Rib, ribose; Xyl, xylose; Fru, fructose; Sor, sorbose; Cello, cellobiose; Lac, lactose; AraL, arabitol; GalL, galactitol; GlyL, glycerol; InoL, myo-inositol; ManL, mannitol; RibL, ribitol; SorL, sorbitol; XylL, xylitol; MB, methyl butyrate; T20, Tween 20; T40, Tween 40; T60, Tween 60; T80, Tween 80.

3.2.2.1. Sugars

Monosaccharides

Galactose, glucose, mannose, fructose, and sorbose are isomeric hexoses with the molecular formula C6H12O6. Arabinose, ribose, and xylose are isomeric pentoses with the formula C5H10O5. Aldoses are monosaccharides with an aldehyde group [42]; ketoses are monosaccharides with a ketone group. In this study, only fructose and sorbose belong to the latter group. All aldoses are reducing sugars, able to donate electrons to other compounds because they possess a free carbonyl group. Ketoses require isomerization to aldoses in order to become reducing sugars [302].

Fewer than half of the strains in the working set (12 of 25) were able to grow with all the tested monosaccharides. Overall mean NAUC was highest with glucose (20.95), followed by mannose (20.10), both aldoses, and fructose (19.13), a ketose; the lowest mean NAUC was observed with sorbose (7.80), also a ketose.

Galactose, glucose, and mannose are physiologically important aldohexoses [42]. The growth relations between these monosaccharides as sole carbon source are shown in Figure 8. Among these monosaccharides, the lowest overall mean NAUC was found for galactose (14.93). This lower mean is due to the fact that four strains (BBC|020, BBC|035, BBC|170, and BBC|177) showed NAUC values below 5.00 with galactose, which was not observed for any strain with the other two aldohexoses. BBC|027 was the only strain showing strong growth (at least 70% of the reference growth) with all three aldohexoses. The growth relations between the pentoses (arabinose, ribose, and xylose) as sole carbon source are shown in Figure 9. Overall mean NAUC was higher for ribose (18.17) than for the other two pentoses (16.63 for arabinose and 15.78 for xylose).

[Figure 8: scatterplot panels A to C; both axes show NAUC at 36 h for the indicated aldohexose, on a scale of 0 to 40.]

Figure 8. Scatterplots showing the growth relation between the isomer aldohexoses tested as sole carbon source. A) Galactose versus glucose, mean point (14.93, 20.95). B) Galactose versus mannose, mean point (14.93, 20.10). C) Glucose versus mannose, mean point (20.95, 20.10). Data refers to 36 h of incubation. The dotted lines intersect the mean point for each pair of aldohexoses, dividing the graphics into quadrants.

The higher NAUC for ribose is an expected result since ribose is a precursor in the production of nucleic acids. Six strains (BBC|003, BBC|008, BBC|020, BBC|027, BBC|029, and BBC|118) showed strong growth with all three pentoses.

[Figure 9: scatterplot panels A to C; both axes show NAUC at 36 h for the indicated pentose, on a scale of 0 to 40.]

Figure 9. Scatterplots showing the growth relation between the isomer pentoses tested as sole carbon source. A) Arabinose versus ribose, mean point (16.63, 18.17). B) Arabinose versus xylose, mean point (16.63, 15.78). C) Ribose versus xylose, mean point (18.17, 15.78). The dotted lines intersect the mean point for each pair of pentoses, dividing the graphics into quadrants.

Overall, when a strain showed high growth with one aldohexose, it tended to show high growth with the other aldohexoses as well (with a few exceptions, especially with galactose). The inverse (low growth with one aldohexose accompanied by low growth with the others) also holds. This behavior was also observed for the pentoses (here, the main exception is with xylose). This tendency was not observed for the tested ketoses (fructose and sorbose). As stated above, the mean growth on sorbose was the lowest of all monosaccharides, with only three strains, BBC|020, BBC|024, and BBC|027, showing high NAUCs for both ketoses.

Disaccharides

Cellobiose and lactose are isomeric disaccharides, with the formula C12H22O11. Both are reducing sugars. Cellobiose is hydrolyzed to glucose by the action of β-glucosidases (EC 3.2.1.21) that catalyze the cleavage of the β-1,4 bond between the two glucose residues. Lactose is hydrolyzed to glucose and galactose by microbial β-galactosidases (EC 3.2.1.23) that catalyze the cleavage of the β-1,4 bond between the two sugar residues. Growth with either of these disaccharides as sole carbon source requires the production of the corresponding enzyme. The relation between the disaccharides as sole carbon source, as well as the relations between cellobiose and lactose and the monosaccharides resulting from their hydrolysis, are shown in Figure 10.
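The hydrolysis reactions described above consume one molecule of water per disaccharide, so the molecular formulas given in the text should balance as C12H22O11 + H2O = 2 C6H12O6. The short sketch below checks this atom balance with Python's `Counter`; the variable names are hypothetical and the check is purely illustrative.

```python
from collections import Counter

DISACCHARIDE = Counter({"C": 12, "H": 22, "O": 11})  # cellobiose or lactose
WATER = Counter({"H": 2, "O": 1})
HEXOSE = Counter({"C": 6, "H": 12, "O": 6})           # glucose or galactose

# Counter addition sums the atom counts element by element.
reactants = DISACCHARIDE + WATER
products = HEXOSE + HEXOSE
balanced = reactants == products

print("reactants:", dict(reactants))
print("products: ", dict(products))
print("balanced: ", balanced)
```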

[Figure 10: scatterplot panels A to D; both axes show NAUC at 36 h for the indicated sugar, on a scale of 0 to 40.]

Figure 10. Scatterplots showing the growth relation between cellobiose and lactose, and between the disaccharides and the monosaccharides that result from their hydrolysis. A) Cellobiose versus lactose, mean point (19.46, 13.27). B) Cellobiose versus glucose, mean point (19.46, 20.95). C) Lactose versus glucose, mean point (13.27, 20.95). D) Lactose versus galactose, mean point (13.27, 14.93). Data refers to 36 h of incubation. The dotted lines intersect the mean point for each pair of sugars, dividing the graphics into quadrants.

As observed for most monosaccharides, most strains showed either high NAUCs for both disaccharides (strains in the first quadrant in Figure 10A) or low NAUCs for both (strains in the third quadrant). In the first case, the strains were able to produce the two types of hydrolases required to cleave the disaccharides into their residues. In the second case, the strains showed low activity for both hydrolases. There were also exceptions, with strains showing high NAUC for lactose and low NAUC for cellobiose (four strains in the second quadrant), and the opposite behavior (two strains in the fourth quadrant). Moreover, there is no clear linear relationship between the growth with disaccharides and the growth with the monosaccharides resulting from their hydrolysis (Figure 10B to 10D). Overall mean growth was higher for cellobiose (19.46) than for lactose (13.27). All strains grew with cellobiose, and only one strain (BBC|148) was not able to grow with lactose. Four strains (BBC|027, BBC|034, BBC|035, and BBC|186) showed strong positive growth for both disaccharides and can be considered good producers of both types of hydrolase.
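The quadrant reading of these scatterplots can be expressed as a small function that places a strain relative to the mean point. The sketch below uses the mean point reported for Figure 10A (cellobiose on the x-axis, lactose on the y-axis); the function name `quadrant`, the NAUC pairs, and the convention of assigning points that fall exactly on a dotted line to the higher side are assumptions made for illustration.

```python
def quadrant(x, y, mean_x, mean_y):
    """Quadrant (1 to 4) of the point (x, y) relative to the mean point.

    1: high x, high y; 2: low x, high y; 3: low x, low y; 4: high x, low y.
    Points lying exactly on a dividing line are counted on the higher side
    (an arbitrary convention chosen for this sketch).
    """
    high_x = x >= mean_x
    high_y = y >= mean_y
    if high_x and high_y:
        return 1
    if high_y:
        return 2
    if not high_x:
        return 3
    return 4


MEAN_CELLOBIOSE, MEAN_LACTOSE = 19.46, 13.27  # mean point of Figure 10A

# Hypothetical (cellobiose, lactose) NAUC pairs, not data from this study.
examples = {"strain A": (30.0, 25.0), "strain B": (8.0, 20.0),
            "strain C": (6.0, 4.0), "strain D": (28.0, 3.0)}
for name, (cello, lac) in examples.items():
    print(name, "quadrant", quadrant(cello, lac, MEAN_CELLOBIOSE, MEAN_LACTOSE))
```

With this layout, the second quadrant holds strains that grow well with lactose but poorly with cellobiose, and the fourth quadrant holds the opposite case.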

Supplementary Table 9 in Appendix F summarizes the utilization of sugars as sole carbon source for all strains in the working set. Strain BBC|027, in cluster I, was the only strain that showed strong positive growth with all sugars, with a mean relative growth of 103% (minimum 87%, maximum 116%). All other strains showed weak or negligible growth with at least one sugar. At the opposite extreme to BBC|027, strain BBC|148, in cluster III, was not able to grow on five of the 10 substrates (arabinose, ribose, xylose, sorbose, and lactose). This strain also showed the lowest mean relative growth with sugars of all strains (10%).

3.2.2.2. Sugar alcohols

Sugar alcohols are polyhydric alcohols (polyols, alcohols with multiple hydroxyl groups), derived from the reduction of the carbonyl group of a monosaccharide to a hydroxyl group [42]. Sugar alcohols may also derive from other carbohydrates, including disaccharides (e.g., maltitol, derived from maltose) [303]. Sugar alcohols are common compatible solutes (osmolytes) in fungi, plants, and animals [304]. Most polyols occur naturally, and are used extensively as bulk sweeteners in the food industry [305]. Chemically, polyols can be cyclic, or acyclic compounds [303]. All tested sugar alcohols were acyclic, except for myo-inositol.

Glycerol is the simplest of the sugar alcohols, with three hydroxyl groups and the formula C3H8O3. It occurs naturally, in the form of triglycerides, in fats and oils [306].

Myo-inositol is a sugar alcohol similar to glucose, with the formula C6H12O6. It is found in cell membranes in the form of phospholipids²⁶.

Arabitol, ribitol, and xylitol are isomeric sugar alcohols with five carbons (pentitols), with the formula C5H12O5. Arabitol, or arabinitol, is a pentitol derived from the reduction of arabinose [307]. Ribitol is derived from ribose²⁷. Xylitol, derived from xylose, is a naturally occurring polyol in fruits and vegetables [308]. Ribitol and xylitol are epimers that differ in the chirality of the central carbon, in the melting point, and in density [309,310].

Galactitol, mannitol, and sorbitol are isomeric sugar alcohols with six carbons, with the formula C6H14O6. Galactitol, or dulcitol, is derived from the reduction of galactose [308]. Mannitol is derived from the reduction of mannose or fructose [308]. Sorbitol, or glucitol, is derived from the reduction of glucose²⁸. Mannitol and sorbitol occur naturally in appreciable amounts: sorbitol in fruits and berries; mannitol in marine algae and tree exudates [308].
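The formulas of the acyclic pentitols and hexitols follow from those of the parent sugars: reducing one carbonyl group to a hydroxyl group adds two hydrogen atoms. The sketch below illustrates this bookkeeping; the function names are hypothetical, and the rule does not apply to myo-inositol, which is cyclic and shares the formula C6H12O6 with the hexoses.

```python
from collections import Counter


def reduce_to_polyol(sugar):
    """Formula of the sugar alcohol obtained by reducing one carbonyl group (adds H2)."""
    return sugar + Counter({"H": 2})


def formula(atoms):
    """Render an atom count as a molecular formula in C, H, O order."""
    return "".join(f"{element}{atoms[element]}" for element in ("C", "H", "O"))


PENTOSE = Counter({"C": 5, "H": 10, "O": 5})  # arabinose, ribose, xylose
HEXOSE = Counter({"C": 6, "H": 12, "O": 6})   # galactose, glucose, mannose

pentitol = formula(reduce_to_polyol(PENTOSE))
hexitol = formula(reduce_to_polyol(HEXOSE))
print("pentitol:", pentitol)
print("hexitol: ", hexitol)
```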

Supplementary Table 10 in Appendix F summarizes the utilization of sugar alcohols as sole carbon source for all strains in the working set. Overall mean growth was highest with mannitol (20.37), with growth values similar to those with glucose, followed by glycerol (18.23) and sorbitol (12.92). Only 10 strains grew with xylitol, and none was able to surpass 15% of the reference growth. Strains BBC|027 (in cluster I), BBC|161, BBC|169, BBC|170, and BBC|186 (in cluster II) were able to grow with all sugar alcohols tested as sole carbon source. BBC|027 showed strong positive growth with five of the eight sugar alcohols, with relative growths above 100% for myo-inositol, ribitol, and sorbitol. The relations between growth with the sugar alcohols with higher mean NAUC are shown in Figure 11.

[Figure 11: scatterplot panels A to C; both axes show NAUC at 36 h for the indicated sugar alcohol, on a scale of 0 to 40.]

Figure 11. Scatterplots showing the growth relation between glycerol and mannitol, glycerol and sorbitol, and between mannitol and sorbitol when present as sole carbon source. A) Glycerol versus mannitol, mean point (18.23, 20.37). B) Glycerol versus sorbitol, mean point (18.23, 12.92). C) Mannitol versus sorbitol, mean point (20.37, 12.92). Data refers to 36 h of incubation. The dotted lines intersect the mean point for each pair of sugar alcohols, dividing the graphics into quadrants.

The growth relations between mannitol and the sugars that yield this sugar alcohol on reduction (mannose or fructose), and between sorbitol and glucose, are shown in Figure 12. A linear relationship should be clear in these scatterplots if the strains show the same capacity to use sugars and their sugar alcohol counterparts as sole carbon source, with the strains falling mostly in the first and third quadrants (high NAUC for both substrates, and low NAUC for both substrates, respectively), and otherwise close to the mean point for both substrates. For sugar alcohols, this was only observed for the sorbitol and glucose pair.

26 NCBI. PubChem Database. Inositol, CID=892, https://pubchem.ncbi.nlm.nih.gov/compound/myo-inositol
27 CHEBI:15963 - ribitol, https://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI:15963
28 NCBI. PubChem Database. Sorbitol, CID=5780, https://pubchem.ncbi.nlm.nih.gov/compound/sorbitol

[Figure 12: scatterplot panels A to C (ManL versus Man, ManL versus Fru, SorL versus Glu); both axes show NAUC at 36 h, on a scale of 0 to 40.]

Figure 12. Scatterplots showing the growth relation between mannitol and mannose, mannitol and fructose, and sorbitol and glucose, when present as sole carbon source. A) Mannitol versus mannose, mean point (20.37, 20.10). B) Mannitol versus fructose, mean point (20.37, 19.13). C) Sorbitol versus glucose, mean point (12.92, 20.95). Data refers to 36 h of incubation. The dotted lines intersect the mean point for each pair of substrates, dividing the graphics into quadrants.

3.2.2.3. Esters

Methyl butyrate²⁹ is the methyl ester of butyric acid (4-carbon saturated acid), and one of the simplest esters. It is a clear colorless liquid, less dense than water, and moderately soluble in water (15 g/l), frequently used to measure esterase activity [33]. Tweens are nonionic surfactants soluble in water (50-100 g/l), resulting from the esterification of a fatty acid and polyoxyethylene sorbitan [311]. Tweens are used in the food industry, in pharmaceuticals, and in cosmetic preparations, as emulsifiers, wetting agents, and stabilizers. Tweens are also used as flavor dispersants. Tween 20³⁰ (polysorbate 20) is a sorbitan ester of lauric acid (12-carbon saturated fatty acid); Tween 40³¹ (polysorbate 40) is a sorbitan ester of palmitic acid (16-carbon saturated fatty acid); Tween 60³² (polysorbate 60) is a sorbitan ester of stearic acid (18-carbon saturated fatty acid); and Tween 80³³ (polysorbate 80) is a sorbitan ester of oleic acid (18-carbon monounsaturated ω-9 fatty acid). Tween 80 was the only ester of an unsaturated fatty acid tested.

Supplementary Table 11 in Appendix F summarizes the utilization of esters as sole carbon source for all strains in the working set. Overall mean NAUC was similar for all the Tweens (14.75 for Tween 20; 15.48 for Tween 40; 15.81 for Tween 60; 15.89 for Tween 80), and lower for methyl butyrate (2.75). Only eight strains were able to grow with methyl butyrate as sole carbon source; of these, strains BBC|021 and BBC|186 reached at least 50% of the reference growth at 36 h of incubation. Several strains grew better with Tweens than with TSB: BBC|186 (cluster II) with Tweens 20, 40, and 60; BBC|021 and BBC|029 (cluster IV) with Tween 40; and BBC|118 (also cluster IV) with all the tested Tweens. For each Tween, at least one strain was unable to grow or showed negligible growth: BBC|016 (cluster III) with Tween 20 and with Tween 40; BBC|005 (cluster II) with Tween 60; and BBC|035 (cluster VI) with Tween 80. The relation between growth with the Tweens is shown in Figure 13. An approximately linear relation is observable in all graphics, indicative of similar behavior with all the Tweens for the strains in the working set.

29 NCBI. PubChem Database. Methyl butyrate, CID=12180, https://pubchem.ncbi.nlm.nih.gov/compound/methylbutyrate
30 NCBI. PubChem Database. Polysorbate 20, CID=443314, https://pubchem.ncbi.nlm.nih.gov/compound/Polysorbate-20
31 NCBI. PubChem Database. CID=92329579, https://pubchem.ncbi.nlm.nih.gov/compound/Polyoxyethylene-sorbitan-monopalmitate
32 NCBI. PubChem Database. CID=24832100, https://pubchem.ncbi.nlm.nih.gov/compound/24832100
33 NCBI. PubChem Database. Polysorbate 80, CID=5284448, https://pubchem.ncbi.nlm.nih.gov/compound/Polysorbate-80

[Figure 13: scatterplot panels A to F (T20 versus T40, T20 versus T60, T20 versus T80, T40 versus T60, T40 versus T80, T60 versus T80); both axes show NAUC at 36 h, on a scale of 0 to 40.]

Figure 13. Scatterplots showing the growth relation between the Tweens tested as sole carbon source. A) Tween 20 versus Tween 40, mean point (14.75, 15.48). B) Tween 20 versus Tween 60, mean point (14.75, 15.81). C) Tween 20 versus Tween 80, mean point (14.75, 15.89). D) Tween 40 versus Tween 60, mean point (15.48, 15.81). E) Tween 40 versus Tween 80, mean point (15.48, 15.89). F) Tween 60 versus Tween 80, mean point (15.81, 15.89). Data refers to 36 h of incubation. The dotted lines intersect the mean point for each pair of Tweens, dividing the graphics into quadrants.

With a few isolated exceptions, strains able to grow with one of the Tweens as sole carbon source were also able to grow with the other tested Tweens.

3.2.2.4. Integration of all the results in the sole carbon source assay

Table 7 summarizes the utilization of sugars, sugar alcohols, and esters as sole carbon source for all strains in the working set. The results were integrated according to the clusters in the dendrogram in Figure 4, considering five categories of carbon source (aldoses, ketoses, disaccharides, sugar alcohols, and esters), and four growth classes relative to the reference growth in TSB (negligible growth, when below 5% of the reference growth; weak growth, when equal to or above 5% and below 30%; positive growth, when equal to or above 30% and below 70%; and strong positive growth, when equal to or above 70%).
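The four growth classes can be illustrated with a short sketch. This is only a minimal example, assuming that relative growth is already expressed as a percentage of the reference growth in TSB; the function name `growth_class` and the dictionary `CLASS_SYMBOLS` are hypothetical and are not part of the original analysis. The symbols follow the legend of Table 7.

```python
# Hypothetical helper: maps relative growth (% of the reference growth in TSB)
# to the four growth classes described in the text.
CLASS_SYMBOLS = {
    "negligible": "-",
    "weak": "±",
    "positive": "+",
    "strong positive": "++",
}


def growth_class(relative_growth_pct: float) -> str:
    """Return the growth class for a relative growth given in percent."""
    if relative_growth_pct < 5:
        return "negligible"       # below 5% of the reference growth
    if relative_growth_pct < 30:
        return "weak"             # 5% or more, below 30%
    if relative_growth_pct < 70:
        return "positive"         # 30% or more, below 70%
    return "strong positive"      # 70% or more


if __name__ == "__main__":
    for value in (2.0, 12.0, 45.0, 95.0):
        label = growth_class(value)
        print(value, label, CLASS_SYMBOLS[label])
```

The lower bound of each class is inclusive, matching the "equal to or above" wording of the class definitions.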

Strains from cluster I (BBC|003, BBC|008, and BBC|027) showed overall mean strong positive growth with aldoses; overall mean strong positive or overall mean positive growth with ketoses and disaccharides; overall mean positive growth with sugar alcohols; and overall mean weak growth with esters. All strains in cluster II, except for BBC|186, showed overall mean weak growth with ketoses. BBC|186 was also the only strain in this cluster to show overall mean strong positive growth, both with disaccharides and esters. All the other strains in the cluster showed overall mean positive or overall mean weak growth with the five substrate categories. Cluster III was the only cluster with strains that showed overall mean negligible growth (BBC|031 with sugar alcohols, and BBC|148 with ketoses), and with a strain for which no growth was observed with either ketose, with NAUC equal to zero (BBC|077). Curiously, this cluster also has two of the three strains (BBC|020 and BBC|024) that showed overall mean strong positive growth with ketoses, the other being BBC|027, in cluster I. Strains in cluster IV consistently showed overall mean strong positive or overall mean positive growth with aldoses and esters.

Table 7. Phenotypic characterization – Sugars, sugar alcohols, and esters utilization by each strain

Strain      Aldoses*  Ketoses  Disaccharides  Sugar Alcohols  Esters
Cluster I#
BBC|003     ++        +        +              +               ±
BBC|008     ++        +        +              +               ±
BBC|027     ++        ++       ++             +               ±
Cluster II
BBC|005     +         ±        ±              ±               +
BBC|032     +         ±        +              ±               ±
BBC|161     +         ±        ±              +               +
BBC|169     ±         ±        ±              ±               ±
BBC|170     ±         ±        +              ±               +
BBC|177     ±         ±        +              ±               ±
BBC|184     +         ±        +              +               +
BBC|186     +         +        ++             +               ++
BBC|205     ±         ±        ±              ±               ±
Cluster III
BBC|016     +         +        +              ±               ±
BBC|020     +         ++       ±              +               +
BBC|024     +         ++       ±              ±               ±
BBC|031     +         +        ±              -               ±
BBC|056     +         ±        ±              ±               ±
BBC|077     ±         NG       ±              ±               ±
BBC|148     ±         -        ±              ±               ±
Cluster IV
BBC|021     +         ±        ±              ±               ++
BBC|029     ++        +        ±              ±               ++
BBC|034     +         +        ++             ±               +
BBC|118     ++        +        +              +               ++
Cluster V
BBC|128     ±         +        +              ±               ±
Cluster VI
BBC|035     +         +        ++             ±               ±

* Results divided into four classes: -, negligible growth; ±, weak growth; +, positive growth; ++, strong positive growth.
# Strains are grouped in clusters according to the dendrogram generated after PCA, presented in Figure 4.
NG, no growth observed (NAUC = 0).

For the strains in cluster IV, growth with the substrates of the remaining categories fell in the overall mean positive or overall mean weak classes, except for BBC|034, which showed overall mean strong positive growth with disaccharides. Strain BBC|128 (cluster V) did not show overall mean strong positive growth with any category, and strain BBC|035 (cluster VI) only showed overall mean strong positive growth with disaccharides.

When considering all substrates in each category, strain BBC|027 (cluster I) showed overall mean growth above 100% of the reference growth with aldoses, ketoses, and disaccharides; and BBC|186 (cluster II), with disaccharides. If only Tweens are considered in the ester category, strains BBC|186 and BBC|021 (cluster IV) also showed overall mean growth higher than the reference value.

3.2.3. Biomass production

Figure 14 shows the NAUC evolution through time for each tested medium, and for each strain. All strains were able to grow in the four media. Most strains showed higher NAUCs in TSB. Two strains, BBC|031 (cluster III), and BBC|035 (cluster VI), showed similarly high growth in TSB and BHI (difference of 5% or less). Strain BBC|020 (cluster III) was the only strain that showed higher growth in BHI. For this strain, NAUC at 120 h for BHI corresponded to 133% of the corresponding NAUC in NB (the second highest), and to 141% of the NAUC for TSB. Overall results show that strains grew better in TSB, followed by BHI, even though TSB showed the highest dispersion of all NAUC results. LB showed the poorest performance, with an overall mean NAUC at 120 h of only about 56% of the mean NAUC for TSB (see Table 8).

[Figure 14: one panel per strain, grouped by cluster (I to VI), with axes NB, TSB, LB, and BHI, and one series per time point (24 h, 48 h, 72 h, 96 h, and 120 h).]

Figure 14. NAUC evolution through time in the biomass assay. Strains are grouped in clusters according to the dendrogram generated after PCA, presented in Figure 4. All graphics are at the same scale for easier reading. Scale 0-200, with divisions at increments of 50 units. NB, Nutrient Broth; TSB, Trypto-casein Soy Broth; LB, LB Broth (Miller); BHI, Brain-Heart Infusion Broth.

Table 8. NAUCs for 120 h of incubation in the biomass assay.

Medium  Max*    Min    Mean    STDEV  CV
NB      105.37  21.35  75.60   21.50  28%
TSB     191.31  50.21  117.08  36.18  31%
LB      98.75   16.35  65.47   17.86  28%
BHI     141.41  29.89  87.05   27.85  31%

* Max, maximum; Min, minimum; STDEV, sample standard deviation; CV, coefficient of variation.

3.2.4. Influence of environmental factors

3.2.4.1. Temperature

Figure 15 shows the NAUC evolution through time for each tested temperature, and for each strain. All strains were able to grow at the three tested temperatures. Almost half the strains (11 in 25) grew better at 28 °C; five strains (BBC|005, in cluster II, BBC|056, BBC|077, BBC|148, in cluster III, and BBC|118, in cluster IV) grew better at 20 °C, and two strains (BBC|016 and BBC|031, in cluster III) grew better at 37 °C. Six of the remaining strains (BBC|003, BBC|008, BBC|027, in cluster I, BBC|032, BBC|186, in cluster II, and BBC|021, in cluster IV) showed similarly high NAUCs (difference of 5% or less) for 20 °C and 28 °C, and one strain (BBC|020, in cluster III) showed similar NAUCs for all temperatures. At the end of the assay, the overall mean NAUC was higher for incubation at 28 °C (see Table 9). The strains in the working set are, therefore, mesophiles, i.e., with their optimal temperature in the midrange [94].

3.2.4.2. Salinity

Figure 16 shows the NAUC evolution through time for each salinity level tested, and for each strain. All strains were able to grow in the interval 0-5% w/v of sodium chloride, even though most of the strains grew better in 0% or 1% w/v of sodium chloride (10 strains for 0%, and nine strains for 1% w/v of sodium chloride) (see Table 10). As all strains were able to grow in medium without added sodium chloride, none of them requires salt for growth, and none can be considered halophilic, i.e., requiring sodium chloride to grow [312]. For 12 of the 25 strains, the presence of sodium chloride enabled higher growth, but for three strains (BBC|027, in cluster I, BBC|024, and BBC|148, in cluster III) the presence of salt was highly detrimental (with NAUCs for 1% w/v of sodium chloride less than half the NAUCs for medium without added sodium chloride). Growth was considered negligible for strains BBC|024 (cluster III), and BBC|029 (cluster IV) in the presence of 7% w/v of sodium chloride, and for nine strains at 9% w/v of sodium chloride. More than half of the working set (14 in 25 strains) can be considered moderately halotolerant, as they were still able to grow at 10% w/v of sodium chloride (the highest concentration tested) [312]. The higher salt concentrations represent a high osmotic stress for many strains in the working set, as inferred from the lower mean NAUCs, when compared with the NAUCs at lower salt concentrations. Two strains (BBC|169, in cluster II, and BBC|034, in cluster IV) showed similarly high NAUCs (difference of 5% or less) for the interval 0-1% w/v of sodium chloride; strain BBC|008 (cluster I) for the interval 1-3% w/v of sodium chloride; and strains BBC|016 and BBC|056 (cluster III) for the interval 0-3% w/v of sodium chloride. BBC|005 (cluster II) showed higher growth at 3% w/v of sodium chloride, and BBC|035 (cluster VI) at 5% w/v of sodium chloride.

3.2.4.3. pH

Figure 17 shows the NAUC evolution through time for each pH tested, and for each strain. All strains were able to grow between pH 5 and pH 11. At pH 3 (the lowest tested) only strain BBC|035 (cluster VI) was able to grow. Most strains (18 in 25) showed higher growth at pH 7, followed by pH 9. The NAUC dispersion was also lower for these pH values (see Table 11). Strains BBC|008 (cluster I), and BBC|186 (cluster II) showed similarly high NAUCs (difference of 5% or less) for pH 5 and pH 7. The same was observed for strain BBC|205 (cluster II) for pH 7 and pH 9. Three strains (BBC|032 and BBC|161, in cluster II, and BBC|035, in cluster VI) showed higher NAUCs at pH 5, and BBC|034 (cluster IV) at pH 9.

[Figure 15: one panel per strain, grouped by cluster (I to VI), with axes 20 °C, 28 °C, and 37 °C, and one series per time point (24 h, 48 h, and 72 h).]

Figure 15. Effect of temperature in NAUC evolution through time. Strains are grouped in clusters according to the dendrogram generated after PCA, presented in Figure 4. All graphics are at the same scale for easier reading. Scale 0-125, with divisions at increments of 25 units.

Table 9. NAUCs for 72 h of incubation in the temperature assay.

Temperature  Max*   Min    Mean   STDEV  CV
20 °C        99.18  18.12  53.50  20.82  39%
28 °C        94.00  25.00  57.09  20.08  35%
37 °C        55.18  17.59  39.36  9.49   24%

* Max, maximum; Min, minimum; STDEV, sample standard deviation; CV, coefficient of variation.

[Figure 16: one panel per strain, grouped by cluster (I to VI), with axes 0%, 1%, 3%, 5%, 7%, 9%, and 10% of sodium chloride, and one series per time point (24 h, 48 h, 72 h, 96 h, and 120 h).]

Figure 16. Effect of salinity in NAUC evolution through time. Strains are grouped in clusters according to the dendrogram generated after PCA, presented in Figure 4. All graphics are at the same scale for easier reading. Scale 0-125, with divisions at increments of 25 units.

Table 10. NAUCs for 120 h of incubation in the salinity assay.

Salinity  Max*    Min    Mean   STDEV  CV
0% NaCl   99.87   36.93  71.03  17.53  25%
1% NaCl   120.55  28.42  66.12  23.01  35%
3% NaCl   102.50  11.97  51.61  24.06  47%
5% NaCl   105.61  10.49  30.61  20.67  68%
7% NaCl   61.93   2.86   20.66  12.55  61%
9% NaCl   44.82   0.62   12.59  10.75  85%
10% NaCl  23.96   0.42   9.48   8.35   88%

* Max, maximum; Min, minimum; STDEV, sample standard deviation; CV, coefficient of variation; NaCl, sodium chloride.

[Figure 17: one panel per strain, grouped by cluster (I to VI), with axes pH 3, pH 5, pH 7, pH 9, and pH 11, and one series per time point (24 h, 48 h, 72 h, 96 h, and 120 h).]

Figure 17. Effect of pH in NAUC evolution through time. Strains are grouped in clusters according to the dendrogram generated after PCA, presented in Figure 4. All graphics are at the same scale for easier reading. Scale 0-200, with divisions at increments of 50 units.

Table 11. NAUCs for 120 h of incubation in the pH assay.

pH     Max*    Min    Mean    STDEV  CV
pH 5#  153.09  5.88   65.81   54.70  83%
pH 7   194.18  48.84  115.98  36.82  32%
pH 9   148.17  36.08  88.96   29.38  33%
pH 11  106.23  10.30  56.78   26.93  47%

* Max, maximum; Min, minimum; STDEV, sample standard deviation; CV, coefficient of variation.
# Results for pH between 5 and 11. At pH 3 only strain BBC|035 was able to grow, reaching a NAUC at 120 h of 37.97.

BBC|029 (cluster IV) was the only strain that showed a clear preference for a particular pH: the NAUC for pH 7 (188.65) was almost twice the second highest NAUC, at pH 9 (95.41). These results show that most of the strains in the working set are neutrophiles, i.e., with the highest growth falling between pH 5.5 and 8 [94]. Strains BBC|032 and BBC|161 can be considered slightly acidophilic, as their maximum growth was at pH 5, but these two strains are still able to grow, even though moderately, at pH 11. Strain BBC|035, being able to grow at pH 3, and showing maximum growth at pH 5, is an acidophile, but it is still capable of substantial growth at pH 9.

3.2.5. Integration of the results in the phenotypic characterization

Table 12 summarizes, for each strain, the best sole carbon source from each tested category (with the best overall substrate highlighted in bold), the best medium for biomass production, and the best environmental conditions for growth (from the conditions studied in the influence of environmental factors assay). The results were integrated according to the clusters in the dendrogram in Figure 4. The best conditions correspond to the highest NAUC values at the end of each assay, out of the conditions tested in all the assays of the phenotypic characterization.

Table 12. Phenotypic characterization – Best growth conditions for each strain.

Strain      Aldose  Ketose  Disaccharide  Sugar Alcohol  Ester     Medium   Temperature  Salinity   pH
Cluster I*
BBC|003     Glu     Fru     Cello         ManL           Tween 80  TSB      20-28 °C     0% NaCl    7
BBC|008     Glu     Fru     Cello         ManL           Tween 80  TSB      20-28 °C     1-3% NaCl  5-7
BBC|027     Man     Fru     Cello         ManL           Tween 20  TSB      20-28 °C     0% NaCl    7
Cluster II
BBC|005     Rib     Fru     Lac           SorL           Tween 40  TSB      20 °C        3% NaCl    7
BBC|032     Rib     Fru     Cello         ManL           Tween 80  TSB      20-28 °C     1% NaCl    5
BBC|161     Man     Fru     Cello         GlyL           Tween 20  TSB      28 °C        0% NaCl    5
BBC|169     Glu     Fru     Cello         GlyL           Tween 60  TSB      28 °C        0-1% NaCl  7
BBC|170     Man     Sor     Lac           ManL           Tween 20  TSB      28 °C        0% NaCl    7
BBC|177     Man     Fru     Cello         GlyL           Tween 60  TSB      28 °C        1% NaCl    7
BBC|184     Ara     Fru     Lac           ManL           Tween 40  TSB      28 °C        0% NaCl    7
BBC|186     Glu     Fru     Cello         GlyL           Tween 20  TSB      20-28 °C     1% NaCl    5-7
BBC|205     Man     Fru     Lac           GlyL           Tween 40  TSB      28 °C        1% NaCl    7-9
Cluster III
BBC|016     Rib     Sor     Lac           ManL           Tween 60  TSB      37 °C        0-3% NaCl  7
BBC|020     Rib     Sor     Lac           GlyL           Tween 60  BHI      20-37 °C     1% NaCl    7
BBC|024     Xyl     Fru     Cello         GlyL           Tween 80  TSB      28 °C        0% NaCl    7
BBC|031     Man     Fru     Lac           ManL           Tween 80  TSB/BHI  37 °C        1% NaCl    7
BBC|056     Ara     Fru     Cello         RibL           Tween 60  TSB      20 °C        0-3% NaCl  7
BBC|077     Man     ---     Cello         ManL           Tween 60  TSB      20 °C        0% NaCl    7
BBC|148     Glu     Fru     Cello         ManL           Tween 60  TSB      20 °C        0% NaCl    7
Cluster IV
BBC|021     Man     Sor     Cello         GlyL           Tween 40  TSB      20-28 °C     1% NaCl    7
BBC|029     Xyl     Fru     Lac           SorL           Tween 40  TSB      28 °C        0% NaCl    7
BBC|034     Rib     Fru     Lac           ManL           Tween 80  TSB      28 °C        0-1% NaCl  9
BBC|118     Xyl     Fru     Cello         GlyL           Tween 60  TSB      20 °C        1% NaCl    7
Cluster V
BBC|128     Glu     Fru     Cello         ManL           Tween 60  TSB      28 °C        0% NaCl    7
Cluster VI
BBC|035     Glu     Fru     Cello         GlyL           Tween 60  TSB/BHI  28 °C        5% NaCl    5

* Strains are grouped in clusters according to the dendrogram generated after PCA, presented in Figure 4. In bold, the substrates, by category, with the higher NAUCs.
Glu, glucose; Man, mannose; Rib, ribose; Ara, arabinose; Xyl, xylose; Fru, fructose; Sor, sorbose; Cello, cellobiose; Lac, lactose; ManL, mannitol; SorL, sorbitol; GlyL, glycerol; RibL, ribitol; TSB, Trypto-casein Soy Broth; BHI, Brain-Heart Infusion Broth.
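The selection of the best conditions, including ranges such as "20-28 °C", can be sketched as follows. This is only an illustration under stated assumptions: the text reports conditions as similar when the difference is 5% or less, and the sketch assumes that this difference is measured relative to the highest NAUC. The function name `best_conditions` and the NAUC values in the usage example are hypothetical and do not come from the assays.

```python
from typing import Dict, List


def best_conditions(nauc_by_condition: Dict[str, float],
                    tolerance: float = 0.05) -> List[str]:
    """Return the conditions whose final NAUC is within `tolerance`
    (as a fraction of the highest NAUC) of the highest NAUC.

    Assumption: the 5% difference is taken relative to the highest value.
    """
    if not nauc_by_condition:
        return []
    top = max(nauc_by_condition.values())
    if top <= 0:
        return []  # no growth under any condition
    return [
        condition
        for condition, nauc in nauc_by_condition.items()
        if (top - nauc) / top <= tolerance
    ]


if __name__ == "__main__":
    # Hypothetical NAUC values at the end of a temperature assay.
    example = {"20 °C": 98.0, "28 °C": 100.0, "37 °C": 60.0}
    print(best_conditions(example))
```

With the hypothetical values above, both 20 °C and 28 °C are kept, which corresponds to an entry of the form "20-28 °C" in Table 12.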

3.3. Enzyme screening

3.3.1. Carboxylic ester hydrolases

Tributyrin is a naturally occurring triglyceride (ester with three fatty acid residues) in milk fat, with very low solubility in water (133 mg/l)34, commonly used as substrate when screening for CEHs, and in kinetics assays [236,237].

34 NCBI. PubChem Database. Tributyrin, CID=6050, https://pubchem.ncbi.nlm.nih.gov/compound/Tributyrin.

Hydrolysis of methyl butyrate produces methanol and butyric acid35; hydrolysis of tributyrin produces glycerol and butyric acid [313,314]; and hydrolysis of Tweens 20, 40, 60, and 80 produces polyoxyethylene sorbitan, and the corresponding fatty acid (lauric acid, palmitic acid, stearic acid, and oleic acid, respectively) [311]. Methyl butyrate and tributyrin are esters with short acyl chains (four carbons each). Both non-specific esterases (active on short acyl chain esters, including short-chain triglycerides) and lipases (active on short, medium, and long acyl chain esters) are able to catalyze the hydrolysis of these two substrates [41,315]. Tweens are esters with medium (Tween 20), and long acyl chains (Tweens 40, 60, and 80). Only lipases are able to act on these substrates. In this study, strains with enzymes active only on methyl butyrate or tributyrin were considered esterase producers; strains with enzymes active on any of the Tweens were considered lipase producers; strains with positive results for both butyrate esters and Tweens are positive for lipase production, and may also produce esterases (with these screening tests it is not possible to discriminate between these two types of enzymes).
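The criterion used to distinguish esterase and lipase producers can be written as a small decision rule. The sketch below is illustrative only; the function name `classify_ceh_producer` and the returned labels are hypothetical, and the two inputs simply state whether activity was detected on a butyrate ester (methyl butyrate or tributyrin) and on any of the Tweens.

```python
def classify_ceh_producer(butyrate_ester_positive: bool,
                          tween_positive: bool) -> str:
    """Apply the screening criterion described in the text.

    butyrate_ester_positive: activity on methyl butyrate or tributyrin.
    tween_positive: activity on at least one of Tweens 20, 40, 60, or 80.
    """
    if tween_positive and butyrate_ester_positive:
        # Lipases also act on short acyl chains, so esterases
        # cannot be confirmed or excluded with these tests.
        return "lipase producer (may also produce esterases)"
    if tween_positive:
        return "lipase producer"
    if butyrate_ester_positive:
        return "esterase producer"
    return "no CEH activity detected"


if __name__ == "__main__":
    print(classify_ceh_producer(butyrate_ester_positive=False, tween_positive=True))
```

The order of the checks matters: activity on Tweens is tested first because lipases can also hydrolyze the short-chain substrates.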

Results from the sole carbon assay relating to the use of esters as sole carbon source and the results from the tributyrin test were integrated. The five strains with the highest relative growth for each substrate in the sole carbon assay were selected as the best CEH producers (see Supplementary Table 12 in Appendix G). All strains were able to produce CEHs, with 18 strains showing strongly active CEHs on at least one of the six substrates tested. Most strains (15 in 25) did not grow, or showed negligible growth, with methyl butyrate, but 10 of these strains showed production of CEHs active on all the other tested substrates. The absence of growth with methyl butyrate does not necessarily mean that these strains are unable to hydrolyze the molecule. Instead, it can be a consequence of not being able to use the products of the hydrolysis of methyl butyrate to grow. Five strains (BBC|008, in cluster I, BBC|005, BBC|184, and BBC|186, in cluster II, and BBC|128, in cluster V) were not able to hydrolyze tributyrin, and, without the sole carbon assay, would be mistakenly categorized as nonproducers of CEHs. Six strains (BBC|161 and BBC|170, in cluster II, BBC|020 and BBC|024, in cluster III, BBC|021 and BBC|029, in cluster IV) showed production of CEHs, at varying levels, with all substrates. Strain BBC|021 (cluster IV) tested positive in the tributyrin test, showed at least 90% of the reference growth when growing with any of the Tweens tested, and 63% with methyl butyrate. This strain is highly efficient at producing CEHs that are active on all substrates tested. BBC|029 (cluster IV) showed similar results to BBC|021, but with lower relative growth in the sole carbon source assay. BBC|170 (cluster II) also showed lower relative growth than BBC|021, but had a strong positive result with tributyrin.
BBC|161 (cluster II) showed a positive result with methyl butyrate (the fifth highest recorded growth), and strong positive results with all other substrates, including tributyrin. BBC|186 (cluster II) showed positive results with methyl butyrate and strong positive results with all the Tweens, but the CEHs produced by this strain were not active on tributyrin. BBC|118 (cluster IV) showed consistently better results with the Tweens than with TSB, albeit showing negligible growth with methyl butyrate. Using the criterion described above, all strains showed production of lipases, as all strains showed positive results with at least one of the tested Tweens. Three strains (BBC|008, in cluster I, BBC|005, in cluster II, and BBC|128, in cluster V) showed negative results for hydrolysis of methyl butyrate and tributyrin (and positive results for all the Tweens). None of these strains produces non-specific esterases. From these results, six strains (BBC|161, BBC|170, and BBC|186, in cluster II, BBC|021, BBC|029, and BBC|118, in cluster IV) were selected as the best CEH producers. In this set of strains, BBC|118 was not able to grow with methyl butyrate alone, and BBC|186 showed a negative result in the tributyrin test. The remaining four strains showed positive results with all substrates tested.

35 NCBI. PubChem Database. Methyl butyrate, CID=12180, https://pubchem.ncbi.nlm.nih.gov/compound/Methyl-butyrate.

3.3.2. Deoxyribonucleases

Singh & Marshall, in 1966 [316], suggested an association between DNase production and pathogenicity. Several studies have targeted the association between DNase production, virulence, and the ability to evade the host's immune defenses (examples in [317–320]). On the other hand, DNA can be used as a source of pentoses and nitrogenous bases, so the ability to produce DNases may confer a competitive advantage over other microorganisms, as it permits the use of a readily available source of nutrients. This theory was first proposed by Fox & Holtman, in 1968 [321]. The majority of strains (18 in 25) were able to degrade DNA, and of these, 14 strains, from clusters I to IV, showed strong positive results.

3.3.3. Glycosidases

3.3.3.1. Agarases

All strains tested negative for hydrolysis of agar, so none of the strains was able to use agar as carbon source. This was an expected result, since all strains in the working set were routinely grown on TSA and depressions in the agar surface were never observed. The results in the agarase assay allowed the use of YNBA as a solid medium to test, as sole carbon source, several substrates that require hydrolase production to be degraded. They also made it possible, for some polysaccharide degraders, to determine some aspects of the mechanism of hydrolase production.

3.3.3.2. Dextranases

All strains tested negative for dextranase production on TSA supplemented with dextran. When dextran was the only carbon source available, three strains (BBC|177, in cluster II, BBC|056, and BBC|148, in cluster III) were able to degrade the substrate. In these strains the production of dextran degrading enzymes is inducible, and probably only occurs in the absence of alternative carbon sources. This limits their biotechnological application, at least in terms of bioaugmentation (since dextrans are not expected to be the only carbon source in scenarios where bioaugmentation may be a solution to consider). Even so, for these strains, it is possible to explore their dextranase producing potential by inducing dextranase synthesis in controlled conditions, and then extracting and using the enzyme, for example, as a biofilm inhibitor.

3.3.3.3. Cellulases

CMC is a cellulose derivative, with β-1,4 linked glucose residues, used as emulsifier and thickening agent in foods and cosmetics, and as bulk laxative in pharmaceuticals36.

36 NCBI. PubChem Database. Carmellose sodium, CID=23706213, https://pubchem.ncbi.nlm.nih.gov/compound/Carmellose-sodium.

CMCase production was observed in 10 strains, when these were grown with additional carbon sources, but CMC degradation was not detected when this cellulose derivative was the only carbon source available. These strains require an initial carbon source to start CMCase production. This was expected, as it is known that cellulase production is halted in many cellulolytic organisms, when easily metabolized carbon sources are available, and induced in the presence of cellulose [322]. In the assay with chromatography paper all strains tested negative, as all paper discs remained unaltered after 30 days. The best cellulase producers, corresponding to strong positive results on TSA supplemented with CMC, were BBC|032, BBC|169, and BBC|177 (cluster II), and BBC|077 (cluster III).

3.3.3.4. β-glucosidases

For β-glucosidase screening two substrates were used: cellobiose, a disaccharide of glucose residues linked by β-1,4 bonds (in the sole carbon source assay), and esculin, a coumarin glucoside. All strains were able to grow on cellobiose as sole carbon source, so all strains are able to produce β-glucosidases (see Supplementary Table 13 in Appendix G). Strong positive growth with cellobiose (above 70% of the reference growth) was observed in 10 strains. Of these, strains BBC|186 (cluster II), BBC|027 (cluster I), and BBC|035 (cluster VI), showed higher growth, in descending order, with cellobiose than with the reference medium. In the esculin assay, 15 strains showed positive results (of which 10 showed strong positive results); the remaining 10 strains did not hydrolyze esculin. The most striking examples are strains BBC|032 and BBC|186 (in cluster II), with strong positive growth with cellobiose, that showed negative results in the esculin assay; the same was observed in the remaining eight strains with a negative result in the esculin test, even though these strains showed weaker results with cellobiose. As for CEHs, all these strains would be mistakenly categorized as nonproducers of β-glucosidase without the sole carbon assay. As all strains were able to produce β-glucosidases, it can be inferred that in the strains with a negative result in the esculin assay the production of β-glucosidase was probably not constitutive, and that the enzyme was not produced when the medium had in its composition other readily available carbon sources. In the strains that tested positive in the esculin assay, assuming that the bacteria did not use the totality of the available carbon, it is possible that the production of β-glucosidases is constitutive (the strains did not need to produce the enzyme to access a carbon source, but, nonetheless, the enzyme was present). The best β-glucosidase producers were selected as the strains with strong positive results in both tests: BBC|003, BBC|008, and BBC|027 (cluster I), BBC|177 (cluster II), BBC|034, and BBC|118 (cluster IV), and BBC|035 (cluster VI).

3.3.3.5. Xylanases

All strains were able to degrade xylan (to varying degrees) when grown on TSA supplemented with xylan, but hydrolysis was not observed in any strain incubated on YNBA supplemented with xylan. As none of the strains was able to degrade xylan when the substrate was the sole carbon source in the medium, it was inferred that an initial source of carbon is required to start the production of xylanases. The best xylanase producers, with strong positive results on TSA supplemented with xylan, were BBC|027 (cluster I), BBC|032, BBC|177, BBC|186, and BBC|205 (cluster II), BBC|056 (cluster III), and BBC|128 (cluster V). Two of these strains, BBC|128 and BBC|177, were not able to grow on xylose as sole carbon source. The xylanase in these bacteria may lead to the release of other monosaccharides that appear as substituents in the polysaccharide, which the bacteria can take up and use as carbon source. As xylan is one of the most abundant hemicelluloses in land plants [59], microorganisms that both produce xylanases and can efficiently use xylose to grow may play a significant role in biomass recycling. Strain BBC|027, being one of the best xylanase producers in the working set, and being capable of strong growth on xylose as sole carbon source (with more than 100% of the reference growth), would be a good candidate for this function. Alternatively, good xylanase producers that do not grow on xylose (like BBC|128 and BBC|177) can be used in consortia with bacteria that show good growth with xylose (like BBC|003, BBC|029, and BBC|118), even if the latter are not good xylanase producers. In a scenario where the objective is to produce xylose, the best strategy would be to use strains that are able to degrade xylan, but that do not take up xylose, like strains BBC|128 and BBC|177, both strong xylanase producers that do not grow on xylose as sole carbon source; or, to a lesser extent, BBC|205, a strong xylanase producer with weak growth with xylose (12% of the reference growth).

3.3.3.6. Pectinases

When grown on TSA supplemented with pectin, 12 strains tested positive for pectin degradation. Of these, only two strains, BBC|008 (cluster I), and BBC|170 (cluster II), showed strong positive results. In the test using YNBA supplemented with pectin, nine strains were able to degrade pectin (not requiring an initial source of carbon to produce pectinases), but none showed strong positive results. See Supplementary Table 14 in Appendix G for detailed results. Strains BBC|003 (cluster I), BBC|032 (cluster II), and BBC|128 (cluster V), only showed pectinase production when there was no alternative carbon source. In these strains, the production of pectin degrading enzymes only occurs in the absence of an alternative, and more readily available, carbon source. On the other hand, six strains, BBC|008 and BBC|027 (cluster I), BBC|205 (cluster II), BBC|020 and BBC|056 (cluster III), and BBC|035 (cluster VI), were only able to degrade pectin when another source of carbon was present, so an initial carbon source is required to induce hydrolase production in these strains. The best producers of pectin degrading enzymes were selected as the strains that were able to degrade pectin in both media: BBC|005, BBC|161, BBC|169, and BBC|170 (cluster II), BBC|016 (cluster III), and BBC|029 (cluster IV). In these strains, the production of pectin degrading enzymes may be constitutive or easily induced.
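The interpretation of the paired plate assays (TSA supplemented with the substrate, and YNBA with the substrate as sole carbon source) follows a simple decision rule, sketched below for the pectinase results. This is a minimal illustration; the function name `interpret_plate_results` and the wording of the returned labels are hypothetical, and the rule only restates the inferences made in the text.

```python
def interpret_plate_results(positive_on_tsa: bool,
                            positive_on_ynba: bool) -> str:
    """Interpret degradation results from the two solid media.

    positive_on_tsa: degradation observed on TSA supplemented with the
        substrate (other carbon sources are available).
    positive_on_ynba: degradation observed on YNBA supplemented with the
        substrate (substrate is the sole carbon source).
    """
    if positive_on_tsa and positive_on_ynba:
        return "constitutive or easily induced production"
    if positive_on_ynba:
        return "production only in the absence of an alternative carbon source"
    if positive_on_tsa:
        return "initial carbon source required for production"
    return "no degradation detected"


if __name__ == "__main__":
    # Example pattern: positive on both media, as reported for BBC|170.
    print(interpret_plate_results(positive_on_tsa=True, positive_on_ynba=True))
```

The same rule describes the dextranase results (positive only on YNBA) and the CMCase and xylanase results (positive only on TSA) reported in the preceding sections.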

3.3.3.7. Amylases

All strains were able to degrade soluble starch, in both TSA and YNBA supplemented with soluble starch. A total of 17 strains showed strong positive results on the supplemented TSA medium, but in the test with starch as sole carbon source, only five strains showed similar results. All strains with strong positive results in the YNBA test also showed strong positive results in the TSA test. The best starch degrading strains, corresponding to strong positive results in both tests, were BBC|056 (cluster III), BBC|034 and BBC|118 (cluster IV), BBC|128 (cluster V), and BBC|035 (cluster VI).

3.3.3.8. β-galactosidases

All strains, apart from BBC|148 (cluster III), were able to grow on lactose as sole carbon source and, consequently, were able to produce β-galactosidases. Four strains (BBC|027, in cluster I, BBC|186, in cluster II, BBC|034, in cluster IV, and BBC|035, in cluster VI) showed strong β-galactosidase production. Of these, BBC|035 showed negligible growth on galactose, so the hydrolase may have been used to produce glucose. The best producer, BBC|027, with a relative growth on lactose slightly higher than on the reference medium, showed similarly high growth on glucose and galactose (both above 95% relative growth). The best β-galactosidase producers (in descending order) were: BBC|027, BBC|035, BBC|186, BBC|034, and BBC|184, the last with almost 70% relative growth on lactose.

3.3.4. Peptidases

Six strains produced peptidases that were able to hydrolyze at least one of the two substrates tested. See Supplementary Table 15 in Appendix G for detailed results. Strains BBC|170 (cluster II), BBC|118 (cluster IV), and BBC|035 (cluster VI) produced peptidases that were able to hydrolyze casein and gelatin; strains BBC|148 (cluster III) and BBC|021 (cluster IV) were able to hydrolyze casein, but not gelatin; and strain BBC|034 (cluster IV) was able to hydrolyze gelatin, but not casein.

3.3.5. Ureases

Overall, 10 strains were able to hydrolyze urea, of which six strains, BBC|003 and BBC|027 (cluster I), BBC|184 (cluster II), BBC|016 and BBC|020 (cluster III), and BBC|128 (cluster V), showed fast positive results.

3.3.6. Integration of all results in the hydrolase screening

Table 13 summarizes the results of the enzyme prospection for each strain in the working set. The results were integrated according to the clusters in the dendrogram in Figure 4. Results were divided into three classes (except for peptidases): negative, positive, and strong positive. Results for peptidases were divided into two classes: negative and positive. Four types of hydrolases appeared in all 25 strains: CEHs, which release fatty acids (essential components of the cellular membrane that can also be used as an energy source); β-glucosidases, which release glucose from β-glucosides (including cellobiose and other cellulose derivatives); xylan degrading enzymes, which release xylose and several other types of monosaccharides; and starch degrading hydrolases, which release glucose (see Figure 18). Right behind these enzymatic groups, β-galactosidases appeared in all strains except BBC|148, and DNase production appeared in more than 70% of the strains in the working set. At the opposite end, dextranases were only detected in three strains (corresponding to 12% of the working set) and peptidase production was detected in only six strains. More than two-thirds of the strains positive for CEHs, DNases, and starch degrading enzymes showed strong activity, against only 13% for pectinases and 17% for β-galactosidases. Of the 11 enzymatic activities with positive results (none of the strains were able, as expected, to degrade agar), two strains (BBC|027, in cluster I, and BBC|032, in cluster II) showed positive results in nine distinct activities and strong positive results in six activities, and both tested negative for dextranases and peptidases (see Figure 19). BBC|034 (cluster IV) also showed strong positive results in six distinct activities but showed negative results in three activities.
Strain BBC|031 (cluster III) appears at the opposite extreme, with only one strong activity (for starch degrading enzymes), and five negative activities. Strain BBC|148 (cluster III) was the only strain with positive results in the two rarest hydrolases in the working set (dextranase and peptidase).
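The per-strain counts and the percentages quoted in this section can be reproduced with a minimal sketch. The rows are copied from Table 13 (without the best-producer marker); the function names `tally` and `percent` are hypothetical. One assumption is labeled in the code: a positive peptidase result is counted as strong, following the note in the caption of Figure 18, since peptidases were scored in two classes only. The denominators 15 (pectinase-positive strains) and 24 (β-galactosidase-positive strains) are derived from the results reported above.

```python
# Column order of Table 13.
COLUMNS = ["CEHs", "DNase", "DEX", "CELL", "β-GLU", "XYL",
           "PEC", "AMY", "β-GAL", "PEP", "URE"]

# Rows copied from Table 13 for the four strains discussed in the text.
TABLE_13_ROWS = {
    "BBC|027": "+ ++ - + ++ ++ + ++ ++ - ++",
    "BBC|032": "++ ++ - ++ ++ ++ + ++ + - +",
    "BBC|034": "++ ++ - + ++ + - ++ ++ + -",
    "BBC|031": "+ + - - + + - ++ + - -",
}


def tally(row: str) -> tuple[int, int, int]:
    """Return (positive, strong, negative) activity counts for one table row."""
    scores = dict(zip(COLUMNS, row.split()))
    positive = sum(1 for s in scores.values() if s != "-")
    strong = sum(1 for s in scores.values() if s == "++")
    # Assumption: peptidases have no "++" class, so a positive result is
    # counted as strong (see the caption of Figure 18).
    if scores["PEP"] == "+":
        strong += 1
    negative = len(scores) - positive
    return positive, strong, negative


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


for strain, row in TABLE_13_ROWS.items():
    print(strain, tally(row))

print(percent(3, 25))   # dextranase-positive strains in the working set
print(percent(2, 15))   # strong producers among pectinase-positive strains
print(percent(4, 24))   # strong producers among β-galactosidase-positive strains
```

Under these assumptions, BBC|027 and BBC|032 give nine positive and six strong activities, BBC|034 gives six strong and three negative activities, and BBC|031 gives one strong and five negative activities, in agreement with the text.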

Table 13. Enzyme production in the working set.

Strain     CEHs*  DNase  DEX    CELL   β-GLU  XYL    PEC    AMY    β-GAL  PEP    URE
Cluster I#
BBC|003    ++     ++•    -      -      ++•    +      +      ++     +      -      ++•
BBC|008    +      ++•    -      -      ++•    +      ++     ++     +      -      -
BBC|027    +      ++•    -      +      ++•    ++•    +      ++     ++•    -      ++•
Cluster II
BBC|005    +      ++•    -      -      +      +      +•     ++     +      -      +
BBC|032    ++     ++•    -      ++•    ++     ++•    +      ++     +      -      +
BBC|161    ++•    -      -      +      +      +      +•     +      +      -      +
BBC|169    ++     -      -      ++•    +      +      +•     +      +      -      -
BBC|170    ++•    ++•    -      -      ++     +      ++•    +      +      +•     -
BBC|177    ++     -      ++•    ++•    ++•    ++•    -      +      +      -      -
BBC|184    +      ++•    -      -      +      +      -      +      +•     -      ++•
BBC|186    ++•    ++•    -      -      ++     ++•    -      +      ++•    -      -
BBC|205    ++     +      -      -      +      ++•    +      +      +      -      -
Cluster III
BBC|016    ++     ++•    -      -      ++     +      +•     ++     +      -      ++•
BBC|020    +      ++•    -      -      +      +      +      ++     +      -      ++•
BBC|024    ++     ++•    -      -      ++     +      -      ++     +      -      -
BBC|031    +      +      -      -      +      +      -      ++     +      -      -
BBC|056    ++     -      ++•    -      +      ++•    +      ++•    +      -      -
BBC|077    ++     -      -      ++•    +      +      -      ++     +      -      -
BBC|148    ++     ++•    ++•    +      +      +      -      +      -      +      -
Cluster IV
BBC|021    ++•    +      -      -      +      +      -      ++     +      +      +
BBC|029    ++•    ++•    -      +      +      +      +•     ++     +      -      -
BBC|034    ++     ++•    -      +      ++•    +      -      ++•    ++•    +      -
BBC|118    ++•    +      -      -      ++•    +      -      ++•    +      +•     -
Cluster V
BBC|128    +      -      -      +      ++     ++•    +      ++•    +      -      ++•
Cluster VI
BBC|035    ++     -      -      -      ++•    +      +      ++•    ++•    +•     -

* CEHs, carboxylic ester hydrolases; DNase, deoxyribonucleases; DEX, dextranases; CELL, cellulose degrading hydrolases; β-GLU, β-glucosidases; XYL, xylan degrading hydrolases; PEC, pectin degrading enzymes; AMY, starch degrading hydrolases; β-GAL, β-galactosidases; PEP, peptidases; URE, ureases. Results divided into three classes (except for peptidases): -, negative; +, positive; ++, strong positive. Results for peptidases divided into two classes: -, negative; +, positive. •, best producers for each enzymatic activity.
# Strains are grouped in clusters according to the dendrogram generated after PCA, presented in Figure 4.

[Figure 18: bar chart titled "Number of Strains with Strong Positive Hydrolase Activity"; y-axis, Number of Strains; x-axis, Hydrolase Activity (CEHs, DNase, DEX, CELL, β-GLU, XYL, PEC, AMY, β-GAL, PEP, URE); series, Positive and Strong Positive.]

Figure 18. Number of strains with positive results by hydrolase activity in the working set. In orange, the number of strains with positive hydrolase production; in blue, the number of strains with strong hydrolase production. The data refer to the working set (n = 25). CEHs, carboxylic ester hydrolases; DNase, deoxyribonucleases; DEX, dextranases; CELL, cellulose degrading hydrolases; β-GLU, β-glucosidases; XYL, xylan degrading hydrolases; PEC, pectin degrading enzymes; AMY, starch degrading hydrolases; β-GAL, β-galactosidases; PEP, peptidases; URE, ureases. All strains with dextranase and peptidase activity were considered as strong producers.

[Figure 19: bar chart titled "Positive and Strong Positive Hydrolase Production"; y-axis, Number of Activities; x-axis, Strains in the Working Set.]

Figure 19. Number of positive and strong positive hydrolase activities by strain. Positive activity in light color; strong positive in dark color. None of the strains showed positive results for all 11 activities with positive results. Strains are grouped in clusters ordered from I to VI (each with a different color) according to the dendrogram generated after PCA, presented in Figure 4.

3.4. Identification of the strains in the working set

3.4.1. Identification, comparison of phenotypic characteristics, and pathogenicity analysis

All the strains in the working set were found to belong to two bacterial phyla: phylum Firmicutes, with eight strains, and phylum Proteobacteria, with the remaining 17. All strains identified as Firmicutes belong to the class Bacilli; strains identified as Proteobacteria belong to the classes Betaproteobacteria (two strains) and Gammaproteobacteria (15 strains). The results from the identification procedure are illustrated here for strains BBC|024 (a Gram-negative rod in the phylum Proteobacteria) and BBC|035 (a Gram-positive rod in the phylum Firmicutes). Figure 20 depicts the phylogenetic trees showing the relationships of strains BBC|024 and BBC|035 with their respective closest species. The phylogenetic reconstructions for the remaining strains of the working set are shown in Appendix H, with the exception of strain BBC|027, which was only identified at the family rank.

[Figure 20: phylogenetic trees, panels A and B.
Panel A taxa: BBC|024; Limnohabitans australis MWH-BRAZ_DAM2DT (FM178226); Limnohabitans curvus MWH-C5T (AJ938026); Limnohabitans parvus II-B4T (FM165536); Limnohabitans planktonicus II-D5T (FM165535); Delftia rhizosphaerae RA6T (KY075818); Delftia litopenaei wsw-7T (GU721027); Delftia acidovorans IAM 12409T (AB021417); Delftia lacustris 332T (EU888308); Delftia deserti YIM Y792T (KP300804).
Panel B taxa: BBC|035; Bacillus atrophaeus JCM 9070T (AB021181); Bacillus velezensis CR-502T (AY603658); Bacillus nakamurai NRRL B-41091T (KU836854); Bacillus siamensis PD-A10T (GQ281299); Bacillus vallismortis DSM 11031T (AB021198); Bacillus amyloliquefaciens DSM 7T (FN597644); Bacillus axarquiensis LMG 22476T (DQ993671); Bacillus halotolerans DSM 8802T (AM747812); Bacillus mojavensis NBRC 15718T (AB021191); Bacillus tequilensis 10bT (HQ223107); Bacillus subtilis subsp. subtilis DSM 10T (AJ276351); Bacillus subtilis subsp. stercoris D7XPN1T (JHCA01000027); Bacillus subtilis subsp. spizizenii NRLL B-23049T (AF074970); Bacillus subtilis subsp. inaquosorum KCTC 13429T (AMXN01000021).]

Figure 20. Phylogenetic trees for strains BBC|024 and BBC|035. The trees were constructed using the Neighbor-Joining method [290], and based on partial 16S rRNA gene sequence. Bootstrap values (1000 replicates) are shown at the branching points [294]. Evolutionary distances were computed using the Jukes-Cantor method [295]. The trees are drawn to scale. Gaps, and missing data were eliminated. Evolutionary analyses were conducted in MEGA X [289]. T indicates type strain for the species; the alphanumeric sequence corresponds to the GenBank accession number to the 16S rRNA gene sequence. A) Relationship between BBC|024 and all valid species in the genera Limnohabitans and Delftia. Optimal tree with the sum of branch length = 0.1697. The analysis involved 10 nucleotide sequences, with a total of 801 positions in the final dataset. Bar, 0.01 nucleotide substitutions per site. B) Relationship between BBC|035 and closely related species in the genus Bacillus. Optimal tree with the sum of branch length = 0.0144. The analysis involved 15 nucleotide sequences, with a total of 835 positions in the final dataset. Bar, 0.001 nucleotide substitutions per site.
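As a brief illustration of the Jukes-Cantor correction mentioned in the caption, the distance d is obtained from the proportion p of differing sites as d = -(3/4) ln(1 - (4/3) p). The sketch below is a generic implementation of that formula, not the MEGA X code; the function name is hypothetical, and using the pair-wise value from Table 14 as p is only an illustrative assumption, since the trees were computed on a separate alignment.

```python
import math


def jukes_cantor_distance(p: float) -> float:
    """Jukes-Cantor distance from the proportion p of differing sites.

    The formula is only defined for 0 <= p < 0.75.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


# Illustrative assumption: take p from the 97.78% pair-wise value reported
# for BBC|024 and L. australis (Table 14).
p = 1.0 - 97.78 / 100.0
d = jukes_cantor_distance(p)
print(f"p = {p:.4f}, d = {d:.4f}")
```

For small p the corrected distance is only slightly larger than p, which is why closely related sequences give nearly identical values with or without the correction.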

BBC|024 was identified as belonging to the genus Limnohabitans (Bacteria, Proteobacteria, Betaproteobacteria, Burkholderiales, Comamonadaceae; see footnote 37), phylogenetically related to L. australis, as supported by the phylogenetic tree and by the homology percentage (97.78%) in the pair-wise alignment with the type strain of the species (see Table 14). This value is below the 98.65% threshold for the strain to be considered as belonging to the same species.

Table 14. Homology matrix from 16S rRNA gene nucleotide-nucleotide comparison for strain BBC|024 and all validly published species in the genus Limnohabitans.

                                 1         2         3         4         5
1. BBC|024*                      100.00%
2. Limnohabitans australis       97.78%    100.00%
3. Limnohabitans planktonicus    97.29%    96.85%    100.00%
4. Limnohabitans parvus          96.80%    97.37%    99.06%    100.00%
5. Limnohabitans curvus          95.94%    98.27%    97.05%    97.99%    100.00%

* 16S rRNA gene sequence for BBC|024 was 812 bp. All other sequences were the ones used for the phylogenetic analysis.

37 Ranks of domain, phylum, class, order, and family.

BBC|035 was identified as belonging to the genus Bacillus (Bacteria, Firmicutes, Bacilli, Bacillales, Bacillaceae). For this strain, the phylogenetic tree did not allow discrimination between the strain and the type strains of B. velezensis, B. nakamurai, and B. vallismortis. It was not possible to determine the species because the 16S rRNA gene has low resolving power for identification at the species rank in several genera, including the genus Bacillus [177,323]. In fact, several studies [186,324–326] have been conducted over the years in an attempt to infer a widely accepted phylogeny for the genus Bacillus. The pair-wise alignment allowed B. velezensis to be removed from this list with a fair degree of confidence, since its 16S rRNA gene homology percentage (98.94%) is lower than that of B. nakamurai and B. vallismortis (99.05% each), even though all three species are above the 98.65% homology threshold (see Table 15).

Table 15. Homology matrix from 16S rRNA gene nucleotide-nucleotide comparison for strain BBC|035 and phylogenetically related Bacillus species.

                             1         2         3         4
1. BBC|035*                  100.00%
2. Bacillus nakamurai        99.05%    100.00%
3. Bacillus vallismortis     99.05%    99.67%    100.00%
4. Bacillus velezensis       98.94%    99.64%    99.72%    100.00%

* 16S rRNA gene sequence for BBC|035 was 846 bp. All other sequences were the ones used for the phylogenetic analysis.

Table 16 outlines the results from the identification from partial 16S rRNA gene sequencing for all the strains in the working set.

Strain BBC|035 was isolated from WWTP activated sludge. As stated above, it belongs to the genus Bacillus, and it is phylogenetically related to B. nakamurai and B. vallismortis. BBC|035 shares with these species the ability to hydrolyze starch and casein, and to grow on glucose, cellobiose, and mannitol. Contrary to these species [327,328], BBC|035 is not able to grow on galactose, though it is able to grow on lactose. No reports of pathogenicity of B. nakamurai and B. vallismortis were found. In a recent study, Badnore et al. [329] reported the application of silver nanoparticles, biosynthesized by B. nakamurai using silver nitrate and polyvinyl pyrrolidone, in an antibacterial peel-off facial mask formulation, and a B. vallismortis isolate (R2) showed promising results against phytopathogenic fungi [330], and against the polyphagous pest Spodoptera litura, a type of moth [331].

Six strains (BBC|056, BBC|077, BBC|148, BBC|169, BBC|177, and BBC|205) are Gram-positive cocci that belong to the genus Staphylococcus (Bacteria, Firmicutes, Bacilli, Bacillales, Staphylococcaceae), and are phylogenetically related with S. cohnii subsp. cohnii. Despite the genetic closeness, these six strains were isolated from five distinct sources (from several WWTPs, and from a hydrocarbon separator sample), with BBC|169 and BBC|177 being obtained in the same isolation step. It should be noted that BBC|056 shows higher homology with BBC|169, than with S. cohnii subsp. cohnii (see Table 17), and that the homology with this subspecies is below the 98.65% threshold to be considered of the same species, even though it appears associated with S. cohnii subsp. cohnii in the phylogenetic reconstruction (see Supplementary Figure 3A in Appendix H).

Table 16. Strain identification from partial 16S rRNA gene sequencing.

Strain     bp*    Comp (%)   Closest Species                          Hom (%)
Domain Bacteria
 Phylum Firmicutes
  Class Bacilli
   Order Bacillales
    Family Bacillaceae
     Genus Bacillus
BBC|035    846    56.10%     B. nakamurai                             99.05%
                  55.29%     B. vallismortis                          99.05%
    Family Staphylococcaceae
     Genus Staphylococcus
BBC|056    636    43.06%     S. cohnii subsp. cohnii                  98.50%
BBC|077    964    65.27%     S. cohnii subsp. cohnii                  98.85%
BBC|148    853    57.75%     S. cohnii subsp. cohnii                  99.06%
BBC|169    895    60.60%     S. cohnii subsp. cohnii                  99.21%
BBC|177    776    52.54%     S. cohnii subsp. cohnii                  98.71%
BBC|205    814    55.11%     S. cohnii subsp. cohnii                  99.02%
BBC|128    892    57.88%     S. edaphicus                             98.54%
                  61.26%     S. saprophyticus subsp. bovis            98.54%
                  57.36%     S. saprophyticus subsp. saprophyticus    98.54%
 Phylum Proteobacteria
  Class Betaproteobacteria
   Order Burkholderiales
    Family Comamonadaceae
     Genus Comamonas
BBC|031    929    63.85%     C. jiangduensis                          99.03%
     Genus Limnohabitans
BBC|024    812    54.13%     L. australis                             97.78%
  Class Gammaproteobacteria
   Order Aeromonadales
    Family Aeromonadaceae
     Genus Aeromonas
BBC|016    883    58.75%     A. veronii                               98.86%
BBC|118    907    60.35%     A. veronii                               98.64%
BBC|034    684    45.91%     A. enteropelogenes                       96.36%
   Order Enterobacteriales
    Family Enterobacteriaceae
BBC|027    334    ---        (unclassified Enterobacteriaceae)#       ---
     Genus Enterobacter
BBC|003    792    52.98%     E. cancerogenus                          99.37%
BBC|008    863    57.73%     E. cancerogenus                          98.72%
     Genus Citrobacter
BBC|020    756    50.67%     C. pasteurii                             99.34%
BBC|032    812    54.42%     C. pasteurii                             98.40%
BBC|184    1008   64.82%     C. portucalensis                         97.51%
   Order Pseudomonadales
    Family Moraxellaceae
     Genus Acinetobacter
BBC|021    901    68.00%     A. gerneri                               96.23%
BBC|029    993    68.01%     Acinetobacter sp. WCHA34†                98.28%
    Family Pseudomonadaceae
     Genus Pseudomonas
BBC|005    583    38.92%     P. kunmingensis                          98.46%
BBC|161    876    57.74%     P. monteilii                             98.17%
BBC|186    869    56.87%     P. knackmussii                           97.93%
                  59.52%     P. nitritireducens                       97.93%
   Order Xanthomonadales
    Family Xanthomonadaceae
     Genus Stenotrophomonas
BBC|170    844    54.91%     S. rhizophila                            98.70%

* bp, base pair; Comp, completeness (according to [332]); Hom, homology. In bold the homology values above the 98.65% threshold.
# The 334 bp of the 16S rRNA gene sequence of BBC|027 were not sufficient to perform the phylogenetic analysis or to reach a consensus identification.
† Acinetobacter sp. WCHA34 appears as a draft genome in Integrated Microbial Genomes and Microbiomes (IMG/M) and in NCBI; it was isolated from hospital sewage in China, and sequenced by Sichuan University in 2018, but does not have a valid publication. Sources: IMG/M, https://img.jgi.doe.gov/cgi-bin/m/main.cgi?section=TaxonDetail&page=taxonDetail&taxon_oid=2767802244. NCBI, https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=1879049.
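The species-level threshold applied to Table 16 can be expressed as a simple comparison. The sketch below uses the highest homology value listed for each strain in Table 16; the names `SPECIES_THRESHOLD`, `TOP_HOMOLOGY`, and `above_threshold` are hypothetical, and BBC|027 is left out because no homology value is reported for it.

```python
# Threshold used in the text for 16S rRNA gene homology at the species rank.
SPECIES_THRESHOLD = 98.65

# Highest homology (%) with the closest species, copied from Table 16.
# Note: the closest match of BBC|029 is a strain without a valid species name.
TOP_HOMOLOGY = {
    "BBC|035": 99.05, "BBC|056": 98.50, "BBC|077": 98.85, "BBC|148": 99.06,
    "BBC|169": 99.21, "BBC|177": 98.71, "BBC|205": 99.02, "BBC|128": 98.54,
    "BBC|031": 99.03, "BBC|024": 97.78, "BBC|016": 98.86, "BBC|118": 98.64,
    "BBC|034": 96.36, "BBC|003": 99.37, "BBC|008": 98.72, "BBC|020": 99.34,
    "BBC|032": 98.40, "BBC|184": 97.51, "BBC|021": 96.23, "BBC|029": 98.28,
    "BBC|005": 98.46, "BBC|161": 98.17, "BBC|186": 97.93, "BBC|170": 98.70,
}

# Strains whose best match exceeds the threshold.
above_threshold = sorted(
    strain for strain, hom in TOP_HOMOLOGY.items() if hom > SPECIES_THRESHOLD
)

print(len(above_threshold), "of", len(TOP_HOMOLOGY), "strains above the threshold")
for strain in above_threshold:
    print(strain, TOP_HOMOLOGY[strain])
```

With these values, half of the 24 strains with a reported homology value exceed the threshold; BBC|118 (98.64%) falls just below it.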

Table 17. Homology matrix from 16S rRNA gene nucleotide-nucleotide comparison for all strains closely related to Staphylococcus cohnii subsp. cohnii.

                                          1         2         3         4         5         6         7
1. BBC|056*                               100.00%
2. BBC|077                                98.50%    100.00%
3. BBC|148                                98.13%    99.06%    100.00%
4. BBC|169                                98.81%    99.21%    98.25%    100.00%
5. BBC|177                                98.50%    98.96%    98.28%    99.07%    100.00%
6. BBC|205                                98.43%    99.02%    98.77%    99.14%    98.51%    100.00%
7. Staphylococcus cohnii subsp. cohnii    98.50%    98.85%    99.06%    99.21%    98.71%    99.02%    100.00%

* All the 16S rRNA gene sequences were the ones used for the phylogenetic analysis. Values above the 98.65% threshold in bold.
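Reading a lower-triangular matrix such as Table 17 is easier with a symmetric lookup. The sketch below stores the values exactly as listed in Table 17; the names `LABELS`, `LOWER`, and `homology` are hypothetical and serve only to illustrate how the comparison between BBC|056, BBC|169, and S. cohnii subsp. cohnii can be checked.

```python
# Row and column labels in the order used in Table 17.
LABELS = ["BBC|056", "BBC|077", "BBC|148", "BBC|169", "BBC|177", "BBC|205",
          "S. cohnii subsp. cohnii"]

# Lower triangle of the homology matrix (%), copied from Table 17.
LOWER = [
    [100.00],
    [98.50, 100.00],
    [98.13, 99.06, 100.00],
    [98.81, 99.21, 98.25, 100.00],
    [98.50, 98.96, 98.28, 99.07, 100.00],
    [98.43, 99.02, 98.77, 99.14, 98.51, 100.00],
    [98.50, 98.85, 99.06, 99.21, 98.71, 99.02, 100.00],
]


def homology(a: str, b: str) -> float:
    """Look up the homology between two entries, in either order."""
    i, j = LABELS.index(a), LABELS.index(b)
    if i < j:
        i, j = j, i  # the matrix is symmetric; only the lower triangle is stored
    return LOWER[i][j]


print(homology("BBC|056", "BBC|169"))                  # 98.81
print(homology("BBC|056", "S. cohnii subsp. cohnii"))  # 98.5
```

The lookup confirms that BBC|056 is closer to BBC|169 (98.81%) than to S. cohnii subsp. cohnii (98.50%).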

In common with S. cohnii subsp. cohnii, all these strains are urease negative (S. cohnii subsp. urealyticum is urease positive). Except for BBC|148, all strains were able to grow, to varying levels, on lactose, whereas S. cohnii subsp. cohnii is described as lactose negative, and S. cohnii subsp. urealyticum is lactose positive in 85% of the strains tested. Also, all strains in the working set are β-glucosidase positive (all strains grew with cellobiose as sole carbon source, even though not all of them showed positive results in the esculin test), whereas both subspecies of S. cohnii are considered β-glucosidase negative [333]. Staphylococcus cohnii is considered part of the commensal microbiota, though the species can act as an opportunistic pathogen, and it is generally associated with hospitals and other healthcare facilities. It is known to cause bacteremia, and to be the causative agent of urinary tract infections (UTIs), endocarditis, and meningitis [334,335]. S. cohnii may also act as a reservoir of antibiotic resistance genes in the hospital environment [336].

Strain BBC|128, a Gram-positive coccus isolated from a canned fish industrial WWTP, also belongs to the genus Staphylococcus. It is phylogenetically related to S. edaphicus, and to both subspecies of S. saprophyticus, but its homology values with both species are below the threshold value. Unlike these two species [337–339], BBC|128 is able to grow on cellobiose, so it produces β-glucosidase. Staphylococcus saprophyticus is a known causative agent of uncomplicated UTIs in young female outpatients. In most of these infections, S. saprophyticus is susceptible to common antimicrobials [340,341]. In rare cases, these UTIs may lead to bacteremia [342]. No reports of pathogenicity of S. edaphicus were found, but this species is closely related to both subspecies of S. saprophyticus (with 16S rRNA gene homology values above 99.70%). S. edaphicus is known to harbor the mecC gene, which encodes an alternative penicillin-binding MecC protein, giving S. edaphicus resistance to β-lactam antibiotics [339].

Strain BBC|031, a Gram-negative rod isolated from a WWTP feed, belongs to the genus Comamonas (Bacteria, Proteobacteria, Betaproteobacteria, Burkholderiales, Comamonadaceae), and is phylogenetically related to C. jiangduensis. Unlike the description of C. jiangduensis, BBC|031 is able to grow on arabinose, mannose, and glucose, and shows β-galactosidase and β-glucosidase activities [343]. In an article from 2016 [210], C. jiangduensis showed promising results in a study on lignocellulose degradation, exhibiting CMCase and xylanase activity. In the present work, strain BBC|031 did not show positive results for CMC hydrolysis, but showed positive results for xylan degradation. Species in the genus Comamonas may be opportunistic pathogens, responsible for bacteremia and meningitis. These pathogens often belong to the species C. testosteroni, C. kerstersii, and C. aquatica [344]. No reports of pathogenicity of C. jiangduensis were found.

Strain BBC|024, isolated from the same source as BBC|031, belongs, as stated above, to the genus Limnohabitans, and is phylogenetically related to L. australis, though with a 16S rRNA gene homology value below the threshold to be considered as belonging to the same species. No reports of pathogenicity of L. australis were found.

Strain BBC|016, isolated from the same source as BBC|024 and BBC|031 (WWTP feed), and strain BBC|118, isolated from a hydrocarbon separator, are Gram-negative rods that belong to the genus Aeromonas (Bacteria, Proteobacteria, Gammaproteobacteria, Aeromonadales, Aeromonadaceae), and are phylogenetically related to A. veronii. Both strains show higher homology with the type strain of the species (98.86% for BBC|016, and 98.64% for BBC|118) than with each other (98.28%). Unlike A. veronii, BBC|016 is a fast urease producer, and does not hydrolyze gelatin [345]. No differences between BBC|118 and the species description were found (within the tests related to this study). Aeromonas veronii is a causative agent of diseases in fish [346,347]. A. veronii has been associated with gastroenteritis in humans, including traveler’s diarrhea [348]. A study from 1997 [349] referring to patients treated in Tokyo, Japan, reported that in more than 70% of the patients with traveler’s diarrhea due to aeromonads, the etiologic agent belonged to A. veronii. In Korea, the species was associated with bacteremia, alongside A. hydrophila and A. caviae [350]. Several other studies report various infections due to this species, from the two recognized biovars (veronii and sobria) [351–356].

Strain BBC|034, a Gram-negative rod isolated from the same source as BBC|035, belongs to the genus Aeromonas, and is phylogenetically related to A. enteropelogenes (with 96.36% homology, below the threshold value). Unlike the description of the species, published as A. trota, a synonym of A. enteropelogenes [357–359], BBC|034 is able to hydrolyze esculin and lactose. Aeromonas enteropelogenes is a known waterborne human pathogen, responsible for gastroenteritis, cellulitis, and septicemia [346], and described as producing enterotoxins [360]. The species (as A. trota) was described as the etiologic agent in gastroenteritis [361], and in a wound infection leading to septic shock [362]. When compared to A. hydrophila, A. enteropelogenes shows a low prevalence associated with diarrhea in hospital settings, but this may be due to the recovery methodology used for clinical specimens. A change in the recovery method, from blood agar supplemented with ampicillin to taurocholate tellurite gelatin agar (a medium without antibiotics), resulted in an increase in the prevalence of this species in children with diarrhea in Thailand, reaching frequency levels similar to A. hydrophila [360].

It was not possible to identify, with a fair degree of certainty, the strain BBC|027 below the family rank, since the 16S rRNA gene sequence resulting from the sequencing step was only 334 bp long. The strain belongs to the family Enterobacteriaceae (Bacteria, Proteobacteria, Gammaproteobacteria). RDP classifier returned “unclassified Enterobacteriaceae” with an 80% confidence threshold. The first 10 hits in EzBioCloud (against valid names) included three species of the genus Klebsiella, six species of the genus Citrobacter, and one species of the genus Yokenella (all Enterobacteriaceae), but BBC|027 has the highest homology value (82.37%) with Klebsiella michiganensis. In the BLASTn suite of NCBI against the 16S ribosomal RNA (Bacteria and Archaea) database, the first hits (with 85.17% identity) included the species Gibbsiella greigii, Raoultella electrica, and [...]. The two best hits against the Nucleotide collection (nr/nt) database were an uncultured bacterium (accession number HM847492), isolated from a diabetic wound in a mouse (Mus musculus), with 86.12% identity; and Klebsiella sp. RPWB1.1 (accession number KC584759), isolated from the beetle Rhynchophorus ferrugineus, with 86.06% identity. Species in the genus Klebsiella are frequently associated with infectious diseases in humans, the most important being K. pneumoniae, a species that, depending on the strain, can be an opportunistic, hypervirulent, or multidrug resistant pathogen [363]. K. michiganensis and K. oxytoca are emerging multidrug resistant human pathogens [364,365]. Gibbsiella greigii, alongside G. quercinecans, is a plant pathogen associated with oak decline [366]. Species in the genus Raoultella are opportunistic pathogens, frequently associated with immunocompromised patients [367]. No reports of pathogenicity of R. electrica were found, but the species was associated with food contamination [367].

Strains BBC|003 and BBC|008, Gram-negative rods isolated from a petrochemical WWTP, belong to the genus Enterobacter (Bacteria, Proteobacteria, Gammaproteobacteria, Enterobacteriaceae), and are phylogenetically related to E. cancerogenus, a species originally described as Erwinia cancerogena, isolated from Populus trees affected with canker disease [368]. Contrary to E. cancerogenus, both strains are able to grow on glycerol and on sorbitol [368]. Both strains show higher homology with the type strain of E. cancerogenus (99.37% for BBC|003, and 98.72% for BBC|008) than with each other (98.36%). Enterobacter cancerogenus (as E. taylorae, a synonym of E. cancerogenus [369]) was recognized as an opportunistic pathogen in 1993, causing bacteremia, cholangitis, [...], and UTIs [370]. The species has also been reported in infected wounds, leading to bacteremia and septicemia [371], in osteomyelitis (infection of bone) [372], and in community-acquired pneumonia [373].

Strains BBC|020 and BBC|032, Gram-negative rods isolated, respectively, from the feed of a WWTP and from WWTP activated sludge, belong to the genus Citrobacter (Bacteria, Proteobacteria, Gammaproteobacteria, Enterobacteriaceae), and are phylogenetically related to C. pasteurii. Both strains tested negative in the esculin test, and were able to grow on cellobiose, unlike what is described for C. pasteurii [374]. BBC|020 shows higher homology with C. pasteurii (99.34%) than with BBC|032 (98.54%), whereas BBC|032 is closer to BBC|020 than to C. pasteurii (98.40%). No reports of C. pasteurii pathogenicity were found. The species is closely related to C. freundii, a species associated with urinary tract infections, and with skin and soft tissue infections [375].

Strain BBC|184, a Gram-negative rod isolated from an industrial WWTP, the same source as BBC|177, also belongs to the genus Citrobacter, and is phylogenetically related to C. portucalensis, with 97.51% homology (below the threshold value). Unlike C. portucalensis [376], BBC|184 is able to grow on arabinose, and is able to degrade starch, but is not able to grow on sorbose or on myo-inositol. No reports of pathogenicity of C. portucalensis were found. The species was only recently described (in 2017), and was isolated from a water well [376].

Strain BBC|021, a Gram-negative rod isolated from the same source as BBC|024 and BBC|031, belongs to the genus Acinetobacter (Bacteria, Proteobacteria, Gammaproteobacteria, Pseudomonadales, Moraxellaceae), and is phylogenetically related to A. gerneri, although with a homology value (96.23%) below the threshold value. Contrary to A. gerneri [377], BBC|021 is able to grow on galactose and on myo-inositol.

The genus Acinetobacter harbors several pathogenic species, of which the most important are A. baumannii (a species phylogenetically close to BBC|021) and A. calcoaceticus. These species are largely associated with infections in patients in intensive care units. A. baylyi, also closely related to A. baumannii, is reported as an opportunistic pathogen. Most hospital infections attributed to Acinetobacter are not further investigated in order to identify the species [378–380], so the species of the etiologic agents remain unknown. The genome annotation of the type strain of A. gerneri revealed 49 genes related to resistance to antibiotics and toxic compounds, including 12 genes for multidrug resistance (MDR) efflux pumps [381]. A. gerneri may be an asset in bioremediation. In 2012 [382], the strain A. gerneri P7 was studied for its ability to degrade a type of polyurethane (Impranil DLN™).

Strain BBC|029, a Gram-negative rod isolated from the same source as BBC|016, BBC|020, BBC|021, BBC|027, and BBC|031, also belongs to the genus Acinetobacter. It is closely related to a strain isolated from hospital sewage in China, Acinetobacter sp. WCHA34, with 98.28% homology. The closest valid species are A. gandensis, with 97.15% homology, and A. schindleri, with 96.26% homology. All these values fall below the threshold of 98.65%.

Strain BBC|005, a Gram-negative rod isolated from the same source as BBC|003 and BBC|008, belongs to the genus Pseudomonas (Bacteria, Proteobacteria, Gammaproteobacteria, Pseudomonadales, Pseudomonadaceae), and is phylogenetically related to P. kunmingensis, with 98.46% homology. Unlike this species [383], BBC|005 is able to grow on fructose, xylose, and, even though weakly, on galactose. No reports of P. kunmingensis pathogenicity were found. A strain of the species, P. kunmingensis L3, is able to degrade phenanthrene, a recalcitrant aromatic hydrocarbon [384].

Strain BBC|161, a Gram-negative rod isolated from the same source as BBC|169, BBC|177, and BBC|184, also belongs to the genus Pseudomonas, and is phylogenetically related to P. monteilii. Unlike this species [385], BBC|161 is able to grow on mannose, lactose, arabitol, mannitol, and ribitol, and tested positive for tributyrin hydrolysis. Pseudomonas monteilii [385] was isolated from clinical specimens, and has been reported as an emerging pathogen in respiratory diseases [386] and in meningoencephalitis [387]. Several strains isolated from hospital settings have been reported to produce metallo-β-lactamases (MBLs), namely VIM-2 [388] and IMP-13 [389], and may function as a reservoir of antimicrobial resistance genes. Despite the species being an emerging pathogen, a study from 2014 found that one isolate, P. monteilii PsF84 [390], helped promote plant growth in rose-scented geranium (Pelargonium graveolens) by solubilizing inorganic phosphates and producing siderophores, playing a potential role in phytoremediation of metal-contaminated soil.

Strain BBC|186, a Gram-negative rod isolated from used oil from a laboratory pump, also belongs to the genus Pseudomonas, and is phylogenetically related to P. knackmussii and P. nitritireducens, with 97.93% homology (for both). Unlike these two species [391,392], BBC|186 was able to grow on cellobiose (BBC|186 showed the highest relative growth of all strains on cellobiose as sole carbon source, 128%), and on sorbitol. BBC|186 is also able to grow on galactose, as is P. nitritireducens, but this species is negative for lactose, whereas BBC|186 showed strong positive growth on this disaccharide (with 75% relative growth). The description of P. knackmussii lacks information about the use of lactose. No reports of P. nitritireducens or P. knackmussii pathogenicity were found. Both species show potential for biotechnological application. P. nitritireducens, described in 2012 [392], plays a role in denitrification. The species is closely related to P. nitroreducens, which is able to release gaseous nitrogen from nitrates [393,394], and P. knackmussii B13T is able to degrade chloroaromatic compounds [395], including 3-chlorobenzoate [391].

Strain BBC|170, a Gram-negative rod isolated from the same source as BBC|169 and BBC|177, belongs to the genus Stenotrophomonas (Bacteria, Proteobacteria, Gammaproteobacteria, Xanthomonadales, Xanthomonadaceae), and is phylogenetically related to S. rhizophila, with 98.70% homology. No differences were found between BBC|170 and the species description [396] within the tests covered by this study. No reports on S. rhizophila pathogenicity were found. The species is normally associated with plants, providing protection and promoting plant growth [396–398]. A closely related species, S. maltophilia, is a known opportunistic pathogen associated, among others, with respiratory infections of nosocomial origin [399]. The genus Stenotrophomonas shows potential biotechnological application in the degradation of xenobiotics (examples in [209,400,401]).

In summary, the family with the highest representation in the working set was the Staphylococcaceae, with seven strains, followed by the Enterobacteriaceae, with six strains. The working set includes strains from Staphylococcus spp. (7), Aeromonas spp. (3), Citrobacter sp. (3), Pseudomonas spp. (3), Acinetobacter spp. (2), Enterobacter sp. (2), Bacillus sp. (1), Comamonas sp. (1), Limnohabitans sp. (1), and Stenotrophomonas sp. (1), as well as an unclassified Enterobacteriaceae strain. None of the presumptive species from the working set is a strict pathogen. Among the species described as opportunistic pathogens, Staphylococcus cohnii is generally only associated with hospitals, the species in the genera Aeromonas and Enterobacter are responsible for community-acquired and hospital-acquired infections, and Pseudomonas monteilii is an emerging pathogen with potential use in phytoremediation.

3.4.2. Comparison between closely phylogenetically related strains

In the dendrogram generated from the phenotypic characterization, strains BBC|003 and BBC|008 appear associated in cluster I, separated by less than 6% dissimilarity. These two strains were identified as Enterobacter sp. closely related to E. cancerogenus, though they only share, between each other, 98.36% homology in the 16S rRNA gene sequence. The main differences between these two strains are: 1) the ability of BBC|003 to grow, even though weakly, on sorbose and on ribitol, but not on galactitol, whereas BBC|008 is not able to grow on sorbose or ribitol, but is able to grow on galactitol; 2) the response to the presence of sodium chloride (BBC|003 shows optimal growth in the complete absence of sodium chloride, whereas BBC|008 grows better with 1-3% w/v of sodium chloride); 3) the pH range at which the strains show optimal growth (BBC|003 shows higher NAUCs when grown at pH 7, whereas BBC|008 shows similar growth at pH 5 and pH 7); and 4) the urease activity (fast positive for BBC|003, negative for BBC|008).

Strains BBC|032 and BBC|020, identified as Citrobacter sp. closely related to C. pasteurii, appear in the dendrogram in cluster II and cluster III, respectively, with more than 24% dissimilarity. The main differences between these two strains are: 1) their ability to grow on methyl butyrate and on several sugar alcohols as sole carbon source; 2) the best growth conditions (medium, temperature, and pH); and 3) the cellulase activity. BBC|032 is able to grow on arabitol, galactitol, and ribitol, but not on glycerol or methyl butyrate; in turn, BBC|020 is able to grow on glycerol and methyl butyrate, but not on arabitol, galactitol, or ribitol. BBC|032 grows better in TSB, between 20 °C and 28 °C, and at pH 5, whereas BBC|020 grows better in BHI, shows similar growth between 20 °C and 37 °C, and grows better at pH 7. Finally, BBC|032 is one of the strains with the best results for CMC degradation, whereas BBC|020 is not able to degrade CMC.

Strains BBC|016 and BBC|118, identified as Aeromonas sp. closely related to A. veronii, appear in the dendrogram, in the clusters III and IV, respectively, with approximately 23% dissimilarity. Strain BBC|016 is able to grow on sorbose, and on myo-inositol as sole carbon source, whereas BBC|118 is not. In turn, BBC|118 is able to grow on ribitol, Tween 20, and on Tween 40, whereas BBC|016 did not grow on these substrates. Strain BBC|016 shows higher NAUC when grown at 37 °C, and does not show growth differences between 0% and 3% w/v of sodium chloride, whereas BBC|118 grows better at 20 °C, and in 1% w/v of sodium chloride. BBC|016 shows pectinase, and urease activity, but is not able to hydrolyze casein, nor gelatin, whereas BBC|118 does not show pectinase, nor urease activity, but shows positive results in the two tests referring to peptidase activity.

Strains BBC|169, BBC|177, and BBC|205 (cluster II), and strains BBC|056, BBC|077, and BBC|148 (cluster III), were identified as Staphylococcus sp. closely related to S. cohnii subsp. cohnii. The strains show differences in the use of sugars and sugar alcohols as sole carbon source, and in the response to the presence of salt. All strains in cluster II show higher NAUCs at 28 °C, whereas the strains in cluster III grow better at 20 °C. The main differences between these strains are in terms of their enzymatic activity: only strains BBC|205 and BBC|148 are able to hydrolyze DNA; strain BBC|148 is the only one that does not show β-galactosidase activity; strains BBC|177, BBC|056, and BBC|148 show dextranase activity (the only ones in the whole working set); strains BBC|169, BBC|177, BBC|077, and BBC|148 are able to degrade CMC; strains BBC|169, BBC|205, and BBC|056 are able to degrade pectin; and strain BBC|148 is the only one in the group that shows peptidase activity, limited to the skim milk test.

As stated by Stackebrandt in 1992 [402], and illustrated by this small set of strains, phylogenetic relatedness is not a robust proxy for phenotypic congruence, which justifies the investment in a thorough physiological and ecological characterization of strain collections when selecting bacteria for biotechnological applications.

3.5. Case study – Urea and ammonium removal for wastewater treatment

3.5.1. Preliminary test – growth in synthetic wastewaters

Six strains in the working set showed fast urease results in the hydrolase screening assay: BBC|003, BBC|016, BBC|020, BBC|027, BBC|128, and BBC|184. Of these, five strains were used in the preliminary test. Strain BBC|027 was not considered for the case study because it was not possible to assign the strain to a definitive genus, as discussed in section 3.4.1.

Strain BBC|128 was not able to grow in synthetic wastewater, whether with urea or with ammonium chloride, and strain BBC|184 showed slow growth (low turbidity) in both synthetic wastewaters. Neither strain was selected for the following tests, so the urea and ammonium removal study was performed with three strains (Enterobacter sp. BBC|003, Aeromonas sp. BBC|016, and Citrobacter sp. BBC|020).

3.5.2. Growth in synthetic wastewater with urea or ammonium as sole nitrogen source

All strains are able to grow in both synthetic wastewaters, confirming the results in the preliminary test (see Figure 21).

[Figure 21: four panels (A, BBC|003; B, BBC|016; C, BBC|020; D, Consortium); y-axis, log10 cfu/ml (5 to 11); x-axis, days (0 to 6); series, Urea and AC.]

Figure 21. Viable cell count (log10 cfu/ml) over time in synthetic wastewater with urea or ammonium chloride as sole nitrogen source. A) Growth curves for strain BBC|003. B) Growth curves for strain BBC|016. C) Growth curves for strain BBC|020. D) Growth curves for the consortium composed of strains BBC|003, BBC|016, and BBC|020. Growth with urea in blue. Growth with ammonium chloride in orange. Each point represents the mean MPN ± STDEV from three replicates. AC, ammonium chloride.

The highest mean MPN of viable cells was achieved by strain BBC|003, using ammonium chloride as nitrogen source, at 24 h of incubation (≈ 1.10 × 10^10 cfu/ml). Using urea as nitrogen source, the highest mean MPN was also observed for BBC|003 (≈ 3.39 × 10^9 cfu/ml), on the fourth day of the test, though a plateau in the number of viable cells from day 2 through day 6 is discernible in Figure 21A. All strains showed an increase of at least one order of magnitude (1-log increase) in the number of viable cells per milliliter in relation to the beginning of the test (see Table 18). The same was observed for the consortium. Strain BBC|003, in axenic conditions, showed the highest overall number of viable cells for both synthetic wastewaters, and the number remained high throughout the experiment. This was not observed for the other two strains, for which the final value was lower than the value at the beginning of the experiment (for these strains the death phase started before the fourth day of incubation). In a scenario of synergistic behavior, the presence of multiple bacterial strains is expected to lead to higher growth and survival rates. Conversely, in a scenario of antagonistic behavior, the growth rate is expected to be slower, or the mortality rate higher.

Table 18. Viable cell count (log10 cfu/ml) of strains BBC|003, BBC|016, BBC|020, and a consortium of the three strains, in synthetic wastewater with urea or ammonium chloride as sole nitrogen source.

                                 Viable Cell Count
Inoculum     Nitrogen source     Initial*  Final  Highest  Day  Lowest  Day
BBC|003      Urea                8.24*     9.47   9.53     4    8.24    0
             Ammonium chloride   8.19*     9.50   10.04    1    8.19    0
BBC|016      Urea                8.11*     6.55   9.47     2    6.55    6
             Ammonium chloride   8.16*     6.44   9.24     1    6.44    6
BBC|020      Urea                8.35*     6.86   9.45     2    6.86    6
             Ammonium chloride   8.10*     6.40   9.27     2    6.40    6
Consortium   Urea                8.30*     8.57   9.47     4    8.30    0
             Ammonium chloride   8.40*     8.91   9.68     2    8.40    0

* Results expressed in log10 cfu/ml. The values were computed from the mean MPN from three replicates.
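The "1-log increase" and the net decline described for Table 18 can be recomputed directly from the tabulated log10 values. The following is a minimal Python sketch, assuming the values are transcribed as shown; the names `TABLE_18` and `log_changes` are illustrative and not part of the original work.

```python
# Log10 cfu/ml values transcribed from Table 18: (initial, final, highest).
# "AC" stands for ammonium chloride, as in the figure captions.
TABLE_18 = {
    ("BBC|003", "urea"): (8.24, 9.47, 9.53),
    ("BBC|003", "AC"): (8.19, 9.50, 10.04),
    ("BBC|016", "urea"): (8.11, 6.55, 9.47),
    ("BBC|016", "AC"): (8.16, 6.44, 9.24),
    ("BBC|020", "urea"): (8.35, 6.86, 9.45),
    ("BBC|020", "AC"): (8.10, 6.40, 9.27),
    ("Consortium", "urea"): (8.30, 8.57, 9.47),
    ("Consortium", "AC"): (8.40, 8.91, 9.68),
}


def log_changes(initial, final, highest):
    """Return (peak increase, net change) in log10 units."""
    return round(highest - initial, 2), round(final - initial, 2)


for (inoculum, source), values in TABLE_18.items():
    peak, net = log_changes(*values)
    print(f"{inoculum:<10} {source:<4} peak {peak:+.2f} log, net {net:+.2f} log")
```

With these values, every inoculum shows a peak increase of more than 1 log, while only BBC|003 and the consortium end the assay above their initial counts.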

Over time, the number of viable cells in the consortium was roughly similar to the number of viable cells found for strain BBC|003 in axenic culture, and the slow mortality rate was equally observed. A slow mortality rate is a highly desirable feature in wastewater treatments based on aerobic suspended growth processes, like the activated sludge. An increasing residence time of the microorganisms is often linked to a lower sludge production, due to endogenous respiration [403], which translates into lower maintenance costs.

When growing in TSA, at 28 °C, for 24 h, strain BBC|003 develops small sized, circular, smooth, white, brilliant, and slightly raised colonies; strain BBC|016 develops small to medium sized colonies, with a slightly irregular shape, whitish, brilliant, flat, and with a white spot in the center; and strain BBC|020 develops medium to large sized colonies, with irregular shape, whitish, brilliant, flat, and with a white spot in the center. Though the colonies from strains BBC|016 and BBC|020 are somewhat similar, the differences in shape and size allow the distinction between strains. All three strains in the consortium bottles were detected throughout the experiment, in all the triplicates. The MPN does not reveal which strains are present in the culture, so it was not possible to discern the relative abundance of each strain in the consortium, nor to infer the effect of the other strains on any one strain in particular. However, the long-term results in Figure 21 suggest that the consortium values are due to strain BBC|003, since the other two strains show a decline in the MPN after day 3, whereas BBC|003 does not. This ability to grow, independently of the presence of other bacterial strains, is highly advantageous when considering wastewater treatment, as the effluent in a WWTP generally brings a high diversity of microorganisms, including pathogens and multi-resistant bacteria [404].

At 24 h of incubation, the bottles with strain BBC|003, whether grown individually or in consortium, started to show noticeable bacterial aggregates in suspension (this behavior was not observed when BBC|003 grew in TSB, a rich medium). The aggregates grew in size over time, and tended to sink once the shaking stopped. The aggregates also seemed to show higher stability when BBC|003 was in the presence of other strains. Through bright field microscopy, it was possible to verify that the aggregates in the consortium were formed by at least two types of bacteria. BBC|003 is a short rod-shaped bacterium, and the aggregates in the consortia showed a high concentration of short rods in the center, surrounded by long rods, characteristic of the other two strains in the consortium. With these observations, it was not possible to determine whether the aggregates were made of two or of three strains. To overcome this limitation of bright field microscopy, an aggregate of these bacteria could have been incubated in a plate of rich medium, like TSA, so that the strains could be effectively identified. This procedure, though, would only show which strains were viable, and not necessarily which strains were effectively present in the aggregate.

Due to the formation of bacterial aggregates, the effective number of viable cells per milliliter can be much higher than the cfu/ml determined by MPN in the cultures with strain BBC|003. The aggregates do not separate easily (even with intense vortexing), so a cfu can correspond to anywhere between one and several hundred (or even more) bacterial cells. In these cases, the computed number of viable cells must be treated with caution, because the viable cells are both planktonic, i.e., living free in suspension, and sessile, in the multicellular aggregates (otherwise the aggregates would not grow in size over time). All the viable cells are using the resources in the media, and excreting their metabolic waste, leading to an increasing competition between cells, which is probably advantageous for the cells in the aggregates (in stress conditions, cells in aggregates tend to show higher metabolic activity [405,406]). It is expected that the increasing stress conditions will, over time, lead to the development of bigger aggregates, as long as nutrients are available. Flocculants are used in wastewater treatment to promote the aggregation of suspended solids, but also to aid floc formation in the activated sludge [403]. Some bacterial strains naturally promote the formation of microbial aggregates, generally by producing extracellular polymeric substances (EPS) [407], which minimizes the need for flocculants, thus lowering maintenance costs. The ability to promote the formation of aggregates shown by strain BBC|003 is, therefore, another highly desirable feature in wastewater treatment.

3.5.3. Respiration assay

The differences in the respiratory rate between BOD bottles incubated with different inocula are attributable to differences in the growth rate: a faster oxygen consumption corresponds to a faster growth. This makes it possible to determine the best candidate (individual strain or consortium) for urea utilization as sole nitrogen source, and to assess differences in the utilization of urea or ammonium chloride.

Within each strain, the mean AUCs throughout the assay were similar for both synthetic wastewaters (with ammonium chloride or urea as sole nitrogen source). The same was observed for the consortium (see Table 19). The results indicate that the strains do not show clear differences in the use of ammonium chloride or urea as nitrogen source, confirming the results of the growth assay discussed above.

Table 19. Mean AUC values for each 24 h interval in the respiration assay.

        BBC|003           BBC|016           BBC|020           Consortium
Time    AC*      Ur*      AC*      Ur*      AC*      Ur*      AC*      Ur*
24 h    7819     9823     9555     7329     11281    11827    12279    12900
48 h    49975    45879    50644    48734    57229    55163    58947    64814
72 h    125759   127631   102192   100422   119539   112972   121222   134093
96 h    214753   231900   156864   154475   188511   175950   190291   215772

* The table shows the mean AUC from triplicates. AC, ammonium chloride. Ur, urea.
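One way to put a number on how similar the two nitrogen sources are is the urea-to-ammonium chloride ratio of the mean AUCs in Table 19, where a ratio close to 1 indicates comparable oxygen consumption. The following is a minimal Python sketch, assuming the values are transcribed as shown; `TABLE_19` and `urea_to_ac_ratio` are illustrative names, and the ratio itself is not a metric used in the original work.

```python
# Mean AUC values transcribed from Table 19: {hours: {inoculum: (AC, urea)}}.
TABLE_19 = {
    24: {"BBC|003": (7819, 9823), "BBC|016": (9555, 7329),
         "BBC|020": (11281, 11827), "Consortium": (12279, 12900)},
    48: {"BBC|003": (49975, 45879), "BBC|016": (50644, 48734),
         "BBC|020": (57229, 55163), "Consortium": (58947, 64814)},
    72: {"BBC|003": (125759, 127631), "BBC|016": (102192, 100422),
         "BBC|020": (119539, 112972), "Consortium": (121222, 134093)},
    96: {"BBC|003": (214753, 231900), "BBC|016": (156864, 154475),
         "BBC|020": (188511, 175950), "Consortium": (190291, 215772)},
}


def urea_to_ac_ratio(ac_auc, urea_auc):
    """Ratio of the mean AUC with urea to the mean AUC with ammonium chloride."""
    return urea_auc / ac_auc


for hours, row in TABLE_19.items():
    ratios = ", ".join(
        f"{name}: {urea_to_ac_ratio(*pair):.2f}" for name, pair in row.items()
    )
    print(f"{hours} h  {ratios}")
```

With the transcribed values, the 96 h ratios lie between about 0.93 and 1.13 for all four inocula.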

The media became turbid in all bottles, and all strains, as well as the consortium, showed equal or higher viable cell counts at the end of the assay, indicating that the cells grew and remained viable throughout the experiment. At 24 h of incubation, strains BBC|003 and BBC|016 showed lower AUCs for urea than for ammonium chloride. The strains must produce ureases before they can use the urea as nitrogen source, so their initial growth may be slower (as the ammonium chloride is a readily available nitrogen source). At the end of the assay, BBC|003 showed the highest AUCs for both synthetic wastewaters, closely followed by the consortium, as was also observed for the growth assay.

3.5.4. Urea and ammonium tolerance

Due to the promising results from the growth and respiration assays, the growth of strain BBC|003 in increasing concentrations of urea or ammonium chloride was studied. The results are shown in Figure 22. The graphs show that the presence of up to 5.00% w/v of urea, or up to 3.00% w/v of ammonium chloride, does not hinder the growth of BBC|003.

[Figure 22: two panels for strain BBC|003 (A, urea; B, ammonium chloride); y-axis, log10 cfu/ml (0 to 10); x-axis, days (0 to 6); series, 0.00% to 10.00% urea in panel A and 0.00% to 9.00% AC in panel B.]

Figure 22. Viable cell count (log10 cfu/ml) over time of strain BBC|003 in TSB supplemented with increasing concentration of urea or ammonium chloride. A) Growth curves in TSB supplemented with 0.00% to 10.00% w/v of urea, in 1.66% increments. B) Growth curves in TSB supplemented with 0.00% to 9.00% w/v of ammonium chloride, in 3.00% increments. TSB+3.00% w/v of ammonium chloride corresponds roughly, in terms of added nitrogen, to TSB+1.66% w/v of urea, TSB+6.00% w/v of ammonium chloride to TSB+3.33% w/v of urea, and TSB+9.00% w/v of ammonium chloride to TSB+5.00% w/v of urea. All media became turbid (indicative of bacterial growth), except for TSB+8.33% w/v of urea, TSB+10.00% w/v of urea, and TSB+9.00% w/v of ammonium chloride. Each point represents the mean MPN ± STDEV from three replicates. AC, ammonium chloride.

The graphs also show that, even though the MPN is somewhat similar for 6.66% and 8.33% w/v of urea, the medium became turbid at the lower concentration (indicating bacterial growth), but remained clear at the higher one (indicating that, even though viable cells remained in the medium throughout the assay, these cells were not able to grow). The same was observed for 6.00% and 9.00% w/v of ammonium chloride. From the results, it is possible to infer that, at high concentrations, ammonium is more toxic than urea, as BBC|003 still shows no alteration in growth with a concentration of urea that, in terms of total nitrogen present in solution, is roughly double that of the ammonium chloride. In other words, BBC|003 can withstand up to twice the amount of added nitrogen when it is in the form of urea than when it is in the form of ammonium. This may be due to alterations in the pH of the medium with the increased concentration of ammonium. For 1.66%, 3.33%, and 5.00% w/v of added urea, and for 3.00% w/v of added ammonium chloride, the number of viable cells increased by around two orders of magnitude (see Table 20 and Table 21), a much higher value than what was observed in the synthetic wastewater. This was an expected result, as TSB is a richer medium than the synthetic wastewaters.
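The nitrogen equivalences quoted in the caption of Figure 22 can be checked from the nitrogen mass fraction of each compound. The following is a minimal Python sketch, assuming standard atomic masses and reading % w/v as grams per 100 ml; the names `N_FRACTION` and `added_nitrogen_g_per_l` are illustrative and not part of the original work.

```python
# Standard atomic masses (g/mol).
N, H, C, O, CL = 14.007, 1.008, 12.011, 15.999, 35.45

# Mass fraction of nitrogen in each compound.
N_FRACTION = {
    "urea": 2 * N / (C + O + 2 * N + 4 * H),  # CO(NH2)2, two nitrogen atoms
    "AC": N / (N + 4 * H + CL),               # NH4Cl, one nitrogen atom
}


def added_nitrogen_g_per_l(percent_wv, compound):
    """Nitrogen added (g/l) by a given % w/v (g per 100 ml) of the compound."""
    grams_per_litre = percent_wv * 10.0
    return grams_per_litre * N_FRACTION[compound]


# Pairs described as roughly equivalent in the caption of Figure 22.
EQUIVALENT_PAIRS = [(3.00, 1.66), (6.00, 3.33), (9.00, 5.00)]

for ac_percent, urea_percent in EQUIVALENT_PAIRS:
    n_ac = added_nitrogen_g_per_l(ac_percent, "AC")
    n_urea = added_nitrogen_g_per_l(urea_percent, "urea")
    print(f"{ac_percent:.2f}% AC -> {n_ac:.2f} g N/l | "
          f"{urea_percent:.2f}% urea -> {n_urea:.2f} g N/l")
```

Under these assumptions, the nitrogen content of each pair differs by less than 2%.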

Table 20. Viable cell count (log10 cfu/ml) of strain BBC|003 in TSB with added urea.

                Viable Cell Count
Medium          Initial*  Final  Highest  Day(s)  Lowest  Day
0.00% Urea      6.85*     9.50   9.92     2       6.85    0
1.66% Urea      7.66*     9.79   9.79     6       7.66    0
3.33% Urea      7.66*     9.79   9.92     4       7.66    0
5.00% Urea      7.56*     9.43   9.57     2, 4    7.56    0
6.66% Urea      6.71*     4.57   6.93     2       4.57    6
8.33% Urea      7.32*     4.57   7.32     0       4.57    6
10.00% Urea     7.17*     1.78   7.17     0       1.74    4

* Results expressed in log10 cfu/ml. The values were computed from the mean MPN from three replicates.

This assay was performed with a mean of ≈ 2.97 × 10^7 cfu/ml at the beginning of the urea test, and ≈ 4.64 × 10^7 cfu/ml at the beginning of the ammonium chloride test. These concentrations are somewhat lower than the concentration of bacterial cells in wastewaters, in the order of 10^8 cells/ml, though within one order of magnitude of it, and significantly lower than the bacterial concentration in sludge, in the order of 10^10 cells/ml [408].

Table 21. Viable cell count (log10 cfu/ml) of strain BBC|003 in TSB with added ammonium chloride.

                            Viable Cell Count
Medium                      Initial*  Final  Highest  Day  Lowest  Day
0.00% Ammonium chloride     6.85*     9.50   9.92     2    6.85    0
3.00% Ammonium chloride     7.63*     9.23   9.92     1    7.63    0
6.00% Ammonium chloride     7.55*     7.17   7.55     0    6.51    2
9.00% Ammonium chloride     7.75*     6.50   7.75     0    6.50    6

* Results expressed in log10 cfu/ml. The values were computed from the mean MPN from three replicates.

Expected values for a WWTP feed range from 125 mg/l of ammoniacal nitrogen and 750 mg/l of urea, in a 1000 tons per day WWTP in India (where urea is extensively used as fertilizer) [137], up to more than 21 g/l of urea in industrial wastewater [409]. The values on the treatment lines can be very different from these, though. In terms of ammoniacal nitrogen, the values may reach up to 1.38 mg/l in the wastewater treatment line, 590 mg/l in the anaerobic treatment line, and 1500 mg/l in the sludge treatment line [410]. Strain BBC|003 is able to grow, in rich medium, with up to 5.00% w/v of urea, and up to 3.00% w/v of ammonium chloride, far exceeding the highest values presented above.
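To compare the tolerated concentrations with the literature values on a common scale, the % w/v figures can be converted to mg/l. The following is a minimal Python sketch, assuming % w/v means grams per 100 ml and using the nitrogen mass fraction of ammonium chloride from standard atomic masses; the function and constant names are illustrative and not part of the original work.

```python
# Nitrogen mass fraction of ammonium chloride (NH4Cl), standard atomic masses.
N_FRACTION_NH4CL = 14.007 / (14.007 + 4 * 1.008 + 35.45)


def percent_wv_to_mg_per_l(percent_wv):
    """Convert % w/v (g per 100 ml) to mg/l."""
    return percent_wv * 10_000


# Highest concentrations at which BBC|003 grew in TSB, as stated in the text.
tolerated_urea_mg_l = percent_wv_to_mg_per_l(5.00)
tolerated_ammoniacal_n_mg_l = percent_wv_to_mg_per_l(3.00) * N_FRACTION_NH4CL

# Highest literature values quoted in the text (mg/l); 21 g/l of urea is
# given there as "more than", so it is used here as a lower bound.
HIGHEST_UREA_MG_L = 21_000
HIGHEST_AMMONIACAL_N_MG_L = 1_500

print(f"Urea: {tolerated_urea_mg_l:.0f} mg/l tolerated vs "
      f"{HIGHEST_UREA_MG_L} mg/l reported")
print(f"Ammoniacal N: {tolerated_ammoniacal_n_mg_l:.0f} mg/l tolerated vs "
      f"{HIGHEST_AMMONIACAL_N_MG_L} mg/l reported")
```

The comparison is only indicative, since the tolerance assay was run in rich medium rather than in wastewater.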

3.6. Case study – Cellulose degradation for wastewater treatment

The removal/valorization of cellulose from wastewaters has been the subject of several studies in recent years (examples in [411–414]). One strategy to lower the cellulose content in the wastewater effluent that undergoes biological treatment is filtering, namely by using fine mesh sieves during the pretreatment [415], with several systems being developed or already marketed. Sieving is able to remove between 50% and 80% of the suspended solids [415]; the remaining cellulose passes to the next phase of wastewater treatment, where it is hydrolyzed and, eventually, used as carbon source during the biological treatment.

None of the strains in the CMCase screening were able to degrade CMC when this cellulose derivative was the only carbon source available. As CMCase activity was only observed when the media contained more readily available carbon sources, for the study of the CMC and the Avicel degradation, it was required to provide an initial carbon source to allow the production of the cellulose degrading machinery, so a 1/10 strength TSB was used as base medium in the assay.

In the screening assay, four strains showed strong positive results for CMC degradation: BBC|032, BBC|077, BBC|169, and BBC|177. Strain BBC|032 was selected because it is phylogenetically related to Citrobacter pasteurii, and was sampled from activated sludge. The remaining three strains are phylogenetically related to Staphylococcus cohnii subsp. cohnii. Strain BBC|077 was selected for the cellulose degradation test based on its isolation origin (food industry WWTP), since the other two strains were isolated from the same sampling episode, in an industrial WWTP. Both selected strains showed strong positive results in the CMC degradation test, and negative results in the esculin hydrolysis test. The results for the utilization of cellobiose as sole carbon source were different: BBC|032 showed strong positive growth (76% of the reference growth, in TSB), whereas BBC|077 showed positive growth (with only 32% of the reference growth).

The cellulose degradation capacity was compared between individual strains, and between individual strains and the same strains when in consortium, by assessing their bacterial growth over time.

Both strains were able to grow in the base medium, and in the base medium with added CMC or Avicel (see Figure 23).

[Figure 23: three panels (A, BBC|032; B, BBC|077; C, Consortium); y-axis, log10 cfu/ml (3 to 11); x-axis, days (0 to 9); series, BM, BM+CMC, and BM+Avi.]

Figure 23. Viable cell count (log10 cfu/ml) over time in base medium, and base medium with added 1% w/v of CMC or Avicel. A) Growth curves for strain BBC|032. B) Growth curves for strain BBC|077. C) Growth curves for the consortium composed of strains BBC|032 and BBC|077. Growth in base medium in blue. Growth in base medium with added CMC in orange. Growth with added Avicel in grey. Each point represents the mean MPN ± STDEV from three replicates. BM, base medium (1/10 strength TSB); CMC, carboxymethyl cellulose; Avi, Avicel.

Strain BBC|077 showed consistently lower viable cell count values, even at the beginning of the experiment. This may be because BBC|077 is a Gram-positive coccus. The cells do not easily separate, even with intense vortexing, as with the aggregates made by strain BBC|003 in stress conditions. As a cfu for BBC|077 corresponds to an undetermined number of bacterial cells, the MPN for this strain may be significantly lower than the real number of viable cells in culture. Despite the discrepancy for strain BBC|077, both strains in the assay, and also the consortium, showed an increase of at least one order of magnitude in relation to the beginning of the experiment (see Table 22).

Table 22. Viable cell count (log10 cfu/ml) of strains BBC|032, BBC|077, and a consortium of the two strains, in base medium, and base medium with added 1.00% w/v of CMC or Avicel.

                        Viable Cell Count
Inoculum     Medium     Initial*  Final  Highest  Day  Lowest  Day(s)
BBC|032      BM         8.44*     7.18   9.19     1    7.18    9
             BM+CMC     7.42*     7.77   9.60     1    7.42    0
             BM+Avi     7.90*     7.46   9.50     1    7.46    9
BBC|077      BM         4.37*     4.37   5.71     3    4.37    0, 9
             BM+CMC     4.37*     4.37   5.46     3    4.37    0, 7, 9
             BM+Avi     4.37*     4.37   6.37     1    4.37    0, 5, 9
Consortium   BM         7.31*     7.41   9.04     1    7.31    0
             BM+CMC     7.55*     7.24   9.46     1    7.24    9
             BM+Avi     7.79*     6.77   9.41     1    6.77    9

* The values were computed from the mean MPN from three replicates. BM, base medium (1/10 strength TSB); CMC, carboxymethyl cellulose; Avi, Avicel.

The highest mean MPN was achieved by strain BBC|032, when growing in base medium with CMC, at 24 h of incubation (≈ 5.27 × 10^9 cfu/ml), followed by the same strain in base medium with Avicel, also at 24 h of incubation (≈ 3.57 × 10^9 cfu/ml). For the consortium, the highest values were also obtained for base medium with CMC, followed by base medium with Avicel. Strain BBC|077 shows a peak for base medium with Avicel, also at 24 h, but for the remainder of the test the MPN is similar for the three media. The graphs in Figure 23 suggest that the addition of CMC or Avicel leads to a higher growth in strain BBC|032 (in relation to the growth in the base medium), but not in strain BBC|077 nor in the consortium.

When growing in TSA, at 28 °C, for 24 h, strain BBC|032 develops medium sized, round, smooth, orange, brilliant, mucoid colonies; strain BBC|077 develops small sized, round, smooth, white colonies. The differences in the morphology of the colonies allow the easy differentiation between strains. Both strains in the consortium were detected throughout the experiment, in all the triplicates. As discussed above, the MPN does not reveal the relative abundance of each strain in the consortium. The results in Figure 23, though, suggest that the consortium values are due to strain BBC|032.

The results of this assay suggest that strain BBC|032 is able to use the additional carbon source in the supplemented media to grow, and is thus able to use CMC and Avicel.

4. General Conclusions and Future Perspectives

Even though alternative methods are available, screening for hydrolase-producing microorganisms using catalysis assays with specific substrates is still a viable option. Metagenomic screening may lead to the discovery of novel genes coding for enzymes, but there are no guarantees that these novel genes lead to the production of active enzymes. Catalysis assays are relatively expeditious processes to find strains which can, in fact, produce the desired enzymes, and that are able to activate metabolic pathways to hydrolyze macromolecules. A good experimental design can even, under certain circumstances, make it possible to discern whether the enzymes are inducible or constitutive, or under which conditions high activity levels are achieved.

In this study, of a working set with 25 bacterial strains, at least one strain of man-made environment origin (BBC|003, isolated from a petrochemical wastewater treatment plant) showed high potential for future application in wastewater treatment, due not only to its ability to use urea as nitrogen source and its tolerance to high concentrations of urea and ammonium, but also to its ability to form natural microbial aggregates in stress conditions. Although the WWTP environment conditions were not specifically tested in this work, the WWTP treatment lines are typically stressful environments that might lead to the formation of desirable microbial aggregates by this strain, thus minimizing the need for flocculants. Further testing is required to verify if strain BBC|003 is suitable for wastewater treatment, including tests to verify if the bacterium is able to grow in domestic, agricultural, or slaughterhouse wastewaters (the most common wastewaters with potential to show high content in urea, and in ammonia, from feces). Tests in synthetic wastewater and in WWTP samples with added urea and ammonia should be performed to assess the range of ammonia and urea concentrations in which the strain can effectively grow. The tests should also be performed in conditions closest to the ones found in WWTP (like lower temperature, and tests without shaking, to mimic the conditions of low-energy WWTP). If these tests confirm the suitability of BBC|003 for wastewater treatment, it will be necessary to determine the best way to deliver the strain to the system (examples include freeze-dried powder, high-concentration suspension, or even immobilized bacteria). Even if it turns out that BBC|003 is not suitable to be used in wastewater treatment, it is always possible to resort to the extraction of desirable enzymes for later use. Strain BBC|032 should also be further studied, to assess its effective potential for cellulose degradation.

The results already obtained show that man-made environments, especially highly contaminated sites, are an interesting source of novel strains with biotechnological potential. The diversity of hydrolase activities found in this small group of strains (within the group, and within each strain) is also testimony to the biotechnological potential of these environments. Even though only bacterial strains were studied, there are other microorganisms in this type of environment that probably show interesting qualities, one of them being the fungi (both filamentous and yeasts), known to show desirable degradative capacity.

In this work consortia were used to test the degradation of a single compound or type of molecule. The results from both case studies did not suggest that, in this case, the consortium attained better results. The hydrolases produced by the different strains did not seem to complement each other, because the consortia results were not better than those of at least one of the individual strains. Nevertheless, there are multiple cases in which consortia lead to better growth/higher degradation rates (examples in [196,206,210,211,416,417]). As the microbial aggregates promoted by BBC|003 tend to show higher stability when they are composed of two or more strains, it would be interesting to study how the strain behaves in the presence of bacterial strains commonly detected in wastewaters. It would also be interesting to test consortia with producers of several different hydrolases (like cellulases, β-glucosidases, amylase, CEHs, and ureases) to see if they could show synergistic behavior and be used for degradation of more than one type of molecule (e.g., starch, cellulose, or lipids as carbon source, and urea as nitrogen source). A consortium of bacteria with these degradative capacities could be used not only in wastewater treatment, but also in sludge treatment, and in compost production.

In this work only wastewater treatment was studied. There are other types of wastes that can benefit from the use of hydrolase-producing organisms. The strains with strong pectinase activity could be tested to analyze their potential in the degradation of pectin-rich wastes, like fruit peels from the production of juices, during compost production. Composting decreases the need for landfills or waste incineration, as compost can be used as fertilizer. Agricultural wastes can also be used as raw material for compost production. In this case, a consortium of strains with cellulase, β-glucosidase, xylanase, and pectinase activities could be very useful to expedite the degradation of the material. Beyond their utilization in waste treatment, urease-producing organisms can be used to sequester heavy metals in soils, in a process called biomineralization [418]. Strain BBC|003, and even other fast urease-positive strains found in this work, could be studied for this type of application. No further studies were performed with esterase/lipase producers, even though some strains showed high growth using esters as sole carbon source, and in the degradation of tributyrin. Some of these strains may be useful for the treatment of oil-contaminated sites (namely some of the strains isolated from petrochemical WWTP), or to help degrade wastes with high fat content. Bacteria that produce dextranases can be very useful to decrease the harmful effects of the proliferation of dextran-producing bacteria, including in preventing biofilm formation, but none of the strains in the working set showed potential to be used for this purpose: for the three dextranase-producing strains (BBC|056, BBC|148, and BBC|177), the dextranase was only produced in the absence of alternative carbon sources, an unlikely scenario in wastewater treatment.

Except for BBC|027, which was only identified as a member of the Enterobacteriaceae, the identification of the strains in the working set was limited to genus level, due to the short length of the 16S rRNA gene sequences obtained. A more accurate identification should be achieved, possibly by using other primers that yield longer nucleotide sequences (e.g., primer 1492R [279]); for some genera, like Bacillus, correct identification may require the sequencing of other genes. This is of extreme importance for BBC|003 if the strain continues to show potential for wastewater treatment, but also for other strains, like BBC|027 (a strain found to be one of the best hydrolase producers for five out of 12 screened activities), which was not considered for further testing beyond the hydrolase screening due to its poor identification. Strain BBC|003 is a coliform bacterium. Coliforms have been used as indicators of fecal contamination in water, and as sanitary indicators in food products, like milk, for nearly a century [419]. The definition of coliform bacteria has changed over the years. In the dairy industry, coliforms were originally defined as “aerobic or facultatively anaerobic, Gram-negative, non-spore forming rods capable of fermenting lactose to produce gas and acid within 48 h at 32–35 °C” [419], and included the species in the genera Citrobacter, Enterobacter, Escherichia, and Klebsiella, though nowadays this definition includes 19 genera [419]. The International Organization for Standardization defines coliforms as members of the family Enterobacteriaceae that are able to express β-D-galactosidase [420]. Being a coliform bacterium may limit the use of BBC|003 to enclosed environments, like WWTP, and solid waste treatment. Still, there are several potential applications for a bacterium with the features that BBC|003 shows.

Finally, it is important to mention that this work presents an initial analysis of the potential of the BBC collection. A more thorough analysis can lead to the discovery of other desirable strains, some with even higher potential for biotechnological applications.

References

1 Morelli, J. (2011) Environmental sustainability: a definition for environmental professionals. J. Environ. Sustain. 1, Article 2.
2 Linthorst, J.A. (2010) An overview: origins and development of green chemistry. Found. Chem. 12, 55-68.
3 Anastas, P.T. and Eghbali, N. (2010) Green chemistry: principles and practice. Chem. Soc. Rev. 39, 301-312.
4 Tundo, P. et al. (2000) Synthetic pathways and processes in green chemistry. Introductory overview. Pure Appl. Chem. 72, 1207-1228.
5 Anastas, P.T. and Warner, J.C. (1998) Green Chemistry: Theory and Practice, Oxford Science Publications.
6 Anastas, P.T. et al. (2001) Catalysis as a foundational pillar of green chemistry. Appl. Catal. A Gen. 221, 3-13.
7 McNaught, D. and Wilkinson, A. (2014) Compendium of Chemical Terminology (IUPAC Gold Book), 2nd Ed. Blackwell Scientific Publications.
8 Nelson, D.L. and Cox, M.M. (2013) Lehninger. Principles of Biochemistry, 6th Ed. MacMillan.
9 Robinson, P.K. (2015) Enzymes: principles and biotechnological applications. Essays Biochem. 59, 1-41.
10 Walter, N.G. and Engelke, D.R. (2002) Ribozymes: catalytic RNAs that cut things, make things, and do odd and useful jobs. Biologist (London). 49, 199-203.
11 Nevinsky, G.A. et al. (2000) Natural catalytic antibodies (abzymes) in normalcy and pathology. Biochem. 65, 1473-1487.
12 Moran, L.A. et al. (2012) Principles of Biochemistry, 5th Ed. Pearson.
13 Sanchez, S. and Demain, A.L. (2017) Useful microbial enzymes - An introduction. In Biotechnology of Microbial Enzymes: Production, Biocatalysis and Industrial Applications (Brahmachari, G. et al., eds), pp. 1-11, Elsevier.
14 Ahuja, S.K. et al. (2004) Utilization of enzymes for environmental applications. Crit. Rev. Biotechnol. 24, 125-154.
15 Karigar, C.S. and Rao, S.S. (2011) Role of microbial enzymes in the bioremediation of pollutants: a review. Enzyme Res. Vol 2011, Article 805187.
16 Li, S. et al. (2012) Technology prospecting on enzymes: application, marketing and engineering. Comput. Struct. Biotechnol. J. 2, e201209017.
17 Busto, E. et al. (2010) Hydrolases: catalytically promiscuous enzymes for non-conventional reactions in organic synthesis. Chem. Soc. Rev. 39, 4504-4523.
18 Palmer, T. and Bonner, P.L. (2007) Enzymes: Biochemistry, Biotechnology and Clinical Chemistry, 2nd Ed. Woodhead Publishing.
19 Dalmaso, G.Z.L. et al. (2015) Marine extremophiles a source of hydrolases for biotechnological applications. Mar. Drugs 13, 1925-1965.
20 Dewan, S.S. (2017) Global markets for enzymes in industrial applications, BCC Research.
21 Tipton, K. and Boyce, S. (2000) History of the enzyme nomenclature system. Bioinformatics 16, 34-40.
22 McDonald, A.G. and Tipton, K.F. (2014) Fifty-five years of enzyme classification: advances and difficulties. FEBS J. 281, 583-592.
23 McDonald, A.G. et al. (2009) ExplorEnz: the primary source of the IUBMB enzyme list. Nucleic Acids Res. 37, D593-D597.
24 Chen, Y. et al. (2016) Carboxylic ester hydrolases: classification and database derived from their primary, secondary, and tertiary structures. Protein Sci. 25, 1942-1953.
25 Rawlings, N.D. and Barrett, A.J. (1993) Evolutionary families of peptidases. Biochem. J. 290, 205-218.
26 Rawlings, N.D. et al. (2018) The MEROPS database of proteolytic enzymes, their substrates and inhibitors in 2017 and a comparison with peptidases in the PANTHER database. Nucleic Acids Res. 46, D624-D632.
27 Rawlings, N.D. et al. (2004) Evolutionary families of peptidase inhibitors. Biochem. J. 378, 705-716.
28 Rawlings, N.D. et al. (2014) MEROPS: the database of proteolytic enzymes, their substrates and inhibitors. Nucleic Acids Res. 42, D503-D509.
29 Rawlings, N.D. et al. (2016) Twenty years of the MEROPS database of proteolytic enzymes, their substrates and inhibitors. Nucleic Acids Res. 44, D343-D350.
30 Lombard, V. et al. (2014) The carbohydrate-active enzymes database (CAZy) in 2013. Nucleic Acids Res. 42, D490-D495.
31 Henrissat, B. (1991) A classification of glycosyl hydrolases based on amino acid sequence similarities. Biochem. J. 280, 309-316.
32 Berman, H.M. et al. (2000) The Protein Data Bank. Nucleic Acids Res. 28, 235-242.
33 Chahinian, H. and Sarda, L. (2009) Distinction between esterases and lipases: comparative biochemical properties of sequence-related carboxylesterases. Protein Pept. Lett. 16, 1149-61.
34 Sood, S. et al. (2016) Microbial carboxylesterases: an insight into thermal adaptation using in silico approach. J. Proteomics Bioinforma. 9, 131-136.
35 Jaeger, K. et al. (1994) Bacterial lipases. FEMS Microbiol. Rev. 15, 29-63.
36 Alvarez-Macarie, E. and Baratti, J. (2000) Short chain flavour ester synthesis by a new esterase from Bacillus licheniformis. J. Mol. Catal. B Enzym. 10, 377-383.
37 Elleuche, S. et al. (2014) Extremozymes - biocatalysts with unique properties from extremophilic microorganisms. Curr. Opin. Biotechnol. 29, 116-123.
38 Fojan, P. et al. (2000) What distinguishes an esterase from a lipase: a novel structural approach. Biochimie 82, 1033-1041.
39 Verger, R. (1997) 'Interfacial activation' of lipases: facts and artifacts. Trends Biotechnol. 15, 32-38.
40 Brocca, S. et al. (2003) Sequence of the lid affects activity and specificity of Candida rugosa lipase isoenzymes. Protein Sci. 12, 2312-2319.
41 Bornscheuer, U.T. (2002) Microbial carboxyl esterases: classification, properties and application in biocatalysis. FEMS Microbiol. Rev. 26, 73-81.
42 Bhagavan, N. V. (2001) Medical Biochemistry, 4th Ed. Harcourt/Academic Press.
43 Cantarel, B.L. et al. (2009) The Carbohydrate-Active EnZymes database (CAZy): an expert resource for glycogenomics. Nucleic Acids Res. 37, D233-D238.
44 Kötzler, M.P. et al. (2014) Glycosidases: functions, families and folds. In Encyclopedia of Life Sciences, pp. 1-14, John Wiley & Sons.
45 Synytsya, A. and Novak, M. (2014) Structural analysis of glucans. Ann. Transl. Med. 2, 17.
46 Zeeman, S.C. et al. (2010) Starch: its metabolism, evolution, and biotechnological modification in plants. Annu. Rev. Plant Biol. 61, 209-234.
47 Hii, S.L. et al. (2012) Pullulanase: role in starch hydrolysis and potential industrial applications. Enzyme Res. Vol 2012, Article 921362.
48 Wayne, R. (2009) Plant Cell Biology - From Astronomy to Zoology, Academic Press.
49 Himmel, M.E. et al. (2007) Cellulases, hemicellulases, and pectinases. In Methods for General and Molecular Microbiology 3rd Ed. (Reddy, C.A. et al., eds), pp. 596-610, ACS Press.
50 Martone, P.T. et al. (2009) Discovery of lignin in seaweed reveals convergent evolution of cell-wall architecture. Curr. Biol. 19, 169-175.
51 Ryan, S.M. et al. (2006) Screening for and identification of starch-, amylopectin-, and pullulan-degrading activities in bifidobacterial strains. Appl. Environ. Microbiol. 72, 5289-5296.
52 Bairoch, A. (2000) The ENZYME database in 2000. Nucleic Acids Res. 28, 304-305.
53 Evert, R.F. and Eichhorn, S.E. (2013) Raven. Biology of Plants, 8th Ed. W. H. Freeman and Company.
54 Bhat, M.K. (2000) Cellulases and related enzymes in biotechnology. Biotechnol. Adv. 18, 355-383.
55 Sukumaran, R.K. et al. (2005) Microbial cellulases - production, applications and challenges. J. Sci. Ind. Res. (India). 64, 832-844.
56 Insam, H. et al. (2010) Microbes at Work - From Wastes to Resources, Springer.
57 Horn, S.J. et al. (2012) Novel enzymes for the degradation of cellulose. Biotechnol. Biofuels 5, 1-12.
58 Scheller, H.V. and Ulvskov, P. (2010) Hemicelluloses. Annu. Rev. Plant Biol. 61, 263-289.
59 Beg, Q.K. et al. (2001) Microbial xylanases and their industrial applications: a review. Appl. Microbiol. Biotechnol. 56, 326-338.
60 Shallom, D. and Shoham, Y. (2003) Microbial hemicellulases. Curr. Opin. Microbiol. 6, 219-228.
61 Viikari, L. et al. (2009) Forest products: biotechnology in pulp and paper processing. In Encyclopedia of Microbiology 3rd Ed. (Schaechter, M., ed), pp. 80-94, Elsevier.
62 Priya, V. and Sashi, V. (2014) Pectinase enzyme producing microorganisms. Int. J. Sci. Res. Publ. 4, 1-3.
63 Pedrolli, D.B. et al. (2009) Pectin and pectinases: production, characterization and industrial application of microbial pectinolytic enzymes. Open Biotechnol. J. 3, 9-18.
64 Saxena, S. (2015) Applied Microbiology, Springer.
65 Singh, R. et al. (2016) Microbial enzymes: industrial progress in 21st century. 3 Biotech 6, 174.
66 Garg, G. et al. (2016) Microbial pectinases: an ecofriendly tool of nature for industries. 3 Biotech 6, 47.

67 Geetha, M. et al. (2012) Screening of pectinase producing bacteria and fungi for its pectinolytic activity using fruit wastes. Int. J. Biochem. Biotech Sci. 1, 30-42.
68 Gargulak, J.D. et al. (2015) Lignin. In Kirk-Othmer Encyclopedia of Chemical Technology, pp. 1-26, John Wiley & Sons.
69 Pollegioni, L. et al. (2015) Lignin-degrading enzymes. FEBS J. 282, 1190-1213.
70 Rasmussen, R.S. and Morrissey, M.T. (2007) Marine biotechnology for production of food ingredients. Adv. Food Nutr. Res. 52, 237-292.
71 Chi, W.-J. et al. (2012) Agar degradation by microorganisms and agar-degrading enzymes. Appl. Microbiol. Biotechnol. 94, 917-930.
72 Phillips, G.O. and Williams, P.A., eds. (2009) Handbook of hydrocolloids, 2nd Ed. Woodhead Publishing.
73 Yun, E.J. et al. (2015) Red macroalgae as a sustainable resource for bio-based products. Trends Biotechnol. 33, 247-249.
74 Fu, X.T. and Kim, S.M. (2010) Agarase: review of major sources, categories, purification method, enzyme characteristics and applications. Mar. Drugs 8, 200-218.
75 Jahromi, S.T. and Barzkar, N. (2018) Future direction in marine bacterial agarases for industrial applications. Appl. Microbiol. Biotechnol. 102, 6847-6863.
76 Early, R. (2012) Dairy products and milk-based food ingredients. In Natural Food Additives, Ingredients and Flavourings (Baines, D. and Seal, R., eds), pp. 417-445, Woodhead Publishing.
77 Liao, H. et al. (2015) Functional diversity and properties of multiple xylanases from Penicillium oxalicum GZ-2. Sci. Rep. 5, 12631.
78 Bacic, A. et al. (2009) Chemistry, Biochemistry and Biology of (1→3)-β-Glucans and Related Polysaccharides, Academic Press.
79 Dickson, R.C. et al. (1979) Purification and properties of an inducible β-galactosidase isolated from the yeast Kluyveromyces lactis. J. Bacteriol. 137, 51-61.
80 Liu, G.X. et al. (2011) β-Galactosidase with transgalactosylation activity from Lactobacillus fermentum K4. J. Dairy Sci. 94, 5811-5820.
81 Khalikova, E. et al. (2005) Microbial dextran-hydrolyzing enzymes: fundamentals and applications. Microbiol. Mol. Biol. Rev. 69, 306-325.
82 Bhavani, A.L. and Nisha, J. (2010) Dextran - the polysaccharide with versatile uses. Int. J. Pharma Bio Sci. 1, 569-573.
83 Rao, M.B. et al. (1998) Molecular and biotechnological aspects of microbial proteases. Microbiol. Mol. Biol. Rev. 62, 597-635.
84 Tkacz, J.S. and Lange, L. (2004) Advances in Fungal Biotechnology for Industry, Agriculture, and Medicine, Springer.
85 Sujoy, B. and Aparna, A. (2013) Enzymology, immobilization and applications of urease enzyme. Int. Res. J. Biol. Sci. 2, 51-56.
86 Karplus, P.A. et al. (1997) 70 years of crystalline urease: what have we learned? Acc. Chem. Res. 30, 330-337.
87 Simoni, R.D. et al. (2002) Urease, the first crystalline enzyme and the proof that enzymes are proteins: the work of James B. Sumner. J. Biol. Chem. 277, e1-e2.
88 Dixon, N.E. et al. (1975) Jack bean urease (EC 3.5.1.5). A metalloenzyme. A simple biological role for nickel? J. Am. Chem. Soc. 97, 4131-4133.
89 Lv, J. et al. (2011) Structural and functional role of nickel ions in urease by molecular dynamics simulation. J. Biol. Inorg. Chem. 16, 125-135.
90 Panda, T. and Gowrishankar, B.S. (2005) Production and applications of esterases. Appl. Microbiol. Biotechnol. 67, 160-169.
91 Tapin, S. et al. (2006) Feruloyl esterase utilization for simultaneous processing of nonwood plants into phenolic compounds and pulp fibers. J. Agric. Food Chem. 54, 3697-3703.
92 Kang, C.-H. et al. (2011) A novel family VII esterase with industrial potential from compost metagenomic library. Microb. Cell Fact. 10, 1-8.
93 Gangola, S. et al. (2018) Presence of esterase and laccase in Bacillus subtilis facilitates biodegradation and detoxification of cypermethrin. Sci. Rep. 8, 12755.
94 Madigan, M.T. et al. (2015) Brock. Biology of Microorganisms, 14th Ed. Pearson.
95 Brault, G. et al. (2012) Isolation and characterization of EstC, a new cold-active esterase from Streptomyces coelicolor A3(2). PLoS One 7, e32041.
96 Kim, S.-B. et al. (2008) Cloning and characterization of thermostable esterase from Archaeoglobus fulgidus. J. Microbiol. 46, 100-107.
97 Montoro-García, S. et al. (2009) Characterization of a novel thermostable carboxylesterase from Geobacillus kaustophilus HTA426 shows the existence of a new carboxylesterase family. J. Bacteriol. 191, 3076-3085.
98 Zhang, X.-Y. et al. (2014) Newly identified thermostable esterase from Sulfobacillus acidophilus: properties and performance in phthalate ester degradation. Appl. Environ. Microbiol. 80, 6870-6878.
99 Sriyapai, P. et al. (2015) Cloning, expression and characterization of a thermostable esterase HydS14 from Actinomadura sp. Strain S14 in Pichia pastoris. Int. J. Mol. Sci. 16, 13579-13594.
100 Gurung, N. et al. (2013) A broader view: microbial enzymes and their relevance in industries, medicine, and beyond. Biomed. Res. Int. Vol 2013, Article 329121.
101 Gross, R.A. et al. (2001) Polymer synthesis by in vitro enzyme catalysis. Chem. Rev. 101, 2097-124.
102 Okino-Delgado, C.H. et al. (2017) Bioremediation of cooking oil waste using lipases from wastes. PLoS One 12, e0186246.
103 Joseph, B. et al. (2008) Cold active microbial lipases: some hot issues and recent developments. Biotechnol. Adv. 26, 457-470.
104 Yasuda, T. et al. (2005) Clinical applications of DNase I, a genetic marker already used for forensic identification. Leg. Med. 7, 274-277.
105 de Souza, P.M. and Magalhães, P. de O. (2010) Application of microbial α-amylase in industry - a review. Brazilian J. Microbiol. 41, 850-861.

106 Saini, R. et al. (2017) Amylases: characteristics and industrial applications. J. Pharmacogn. Phytochem. 6, 1865-1871.
107 Adrio, J.L. and Demain, A.L. (2014) Microbial enzymes: tools for biotechnological processes. Biomolecules 4, 117-139.
108 Janarthanan, R. et al. (2014) Bioremediation of vegetable wastes through biomanuring and enzyme production. Int. J. Curr. Microbiol. Appl. Sci. 3, 89-100.
109 Del Valle, E.M.M. (2004) Cyclodextrins and their uses: a review. Process Biochem. 39, 1033-1046.
110 Ray, R.R. (2011) Microbial isoamylases: an overview. Am. J. Food Technol. 6, 1-18.
111 Karam, J. and Nicell, J.A. (1997) Potential applications of enzymes in waste treatment. J. Chem. Technol. Biotechnol. 69, 141-153.
112 Kirk, O. et al. (2002) Industrial enzyme applications. Curr. Opin. Biotechnol. 13, 345-351.
113 Karmakar, M. and Ray, R.R. (2011) Current trends in research and application of microbial cellulases. Res. J. Microbiol. 6, 41-53.
114 Isikgor, F.H. and Becer, C.R. (2015) Lignocellulosic biomass: a sustainable platform for the production of bio-based chemicals and polymers. Polym. Chem. 6, 4497-4559.
115 Singh, G. et al. (2016) Catalytic properties, functional attributes and industrial applications of β-glucosidases. 3 Biotech 6, 1-14.
116 Obruca, S. et al. (2015) Use of lignocellulosic materials for PHA Production. Chem. Biochem. Eng. Q. 29, 135-144.
117 Brahmachari, G. et al. (2017) Biotechnology of Microbial Enzymes, Academic Press.
118 Poutanen, K. et al. (1990) Deacetylation of xylans by acetyl esterases of Trichoderma reesei. Appl. Microbiol. Biotechnol. 33, 506-510.
119 Alalouf, O. et al. (2011) A new family of carbohydrate esterases is represented by a GDSL hydrolase/acetylxylan esterase from Geobacillus stearothermophilus. J. Biol. Chem. 286, 41993-42001.
120 Bajpai, P. (2014) Microbial xylanolytic systems and their properties. In Xylanolytic Enzymes, pp. 19-36, Elsevier.
121 Jahan, N. et al. (2017) Utilization of agro waste pectin for the production of industrially important polygalacturonase. Heliyon 3, e00330.
122 Harju, M. et al. (2012) Lactose hydrolysis and other conversions in dairy products: technological aspects. Int. Dairy J. 22, 104-109.
123 Saqib, S. et al. (2017) Sources of β-galactosidase and its applications in food industry. 3 Biotech 7, 79.
124 Nath, A. et al. (2014) Production, purification, characterization, immobilization, and application of β-galactosidase: a review. Asia-Pacific J. Chem. Eng. 9, 330-348.
125 Mlichová, Z. and Rosenberg, M. (2006) Current trends of β-galactosidase application in food technology. J. Food Nutr. Res. 45, 47-54.
126 Jiménez, E.R. (2009) Dextranase in sugar industry: a review. Sugar Tech 11, 124-134.
127 Gibbons, R.J. and Banghart, S.B. (1967) Synthesis of extracellular dextran by cariogenic bacteria and its presence in human dental plaque. Arch. Oral Biol. 12, 11-24.

128 Gibbons, R.J. and Keyes, P.H. (1969) Inhibition of insoluble dextran synthesis, plaque formation and dental caries in hamsters by low molecular weight dextran. Arch. Oral Biol. 14, 721-724.
129 Jiao, Y.-L. et al. (2014) Characterization of a marine-derived dextranase and its application to the prevention of dental caries. J. Ind. Microbiol. Biotechnol. 41, 17-26.
130 Singh, R. et al. (2016) Microbial proteases in commercial applications. J. Pharm. Chem. Biol. Sci. 4, 365-374.
131 Gupta, R. et al. (2002) Bacterial alkaline proteases: molecular approaches and industrial applications. Appl. Microbiol. Biotechnol. 59, 15-32.
132 Verhoeckx, K.C.M. et al. (2015) Food processing and allergenicity. Food Chem. Toxicol. 80, 223-240.
133 Kasana, R.C. (2010) Proteases from psychrotrophs: an overview. Crit. Rev. Microbiol. 36, 134-145.
134 Miyagawa, K. et al. (1999) Purification, characterization, and application of an acid urease from Arthrobacter mobilis. J. Biotechnol. 68, 227-236.
135 Qin, Y. and Cabral, J.M.S. (2002) Properties and applications of urease. Biocatal. Biotransformation 20, 1-14.
136 Yang, L.-Q. et al. (2010) Purification, properties, and application of a novel acid urease from Enterobacter sp. Appl. Biochem. Biotechnol. 160, 303-313.
137 Latkar, M. and Chakrabarti, T. (1994) Performance of upflow anaerobic sludge blanket reactor carrying out biological hydrolysis of urea. Water Environ. Res. 66, 12-15.
138 Bremner, J.M. (1995) Recent research on problems in the use of urea as a nitrogen fertilizer. Fertil. Res. 42, 321-329.
139 Alves, P.D.D. et al. (2014) Survey of microbial enzymes in soil, water, and plant microenvironments. Open Microbiol. J. 8, 25-31.
140 Sneha, S. et al. (2014) Isolation and screening of protease producing bacteria from marine waste. J. Chem. Pharm. Res. 6, 1157-1159.
141 Rupali, D. (2015) Screening and isolation of protease producing bacteria from soil collected from different areas of Burhanpur Region (MP) India. Int. J. Curr. Microbiol. Appl. Sci. 4, 597-606.
142 Desai, S.S. et al. (2010) Isolation of keratinase from bacterial isolates of poultry soil for waste degradation. Eng. Life Sci. 10, 361-367.
143 Prasad, H. V. et al. (2010) Screening of extracellular keratinase producing bacteria from feather processing areas in Vellore, Tamil Nadu, India. J. Sci. Res. 2, 559-565.
144 Raju, E.V.N. and Divakar, G. (2013) Screening and isolation of keratinase producing bacteria from poultry waste. Int. J. Pharm. Res. Allied Sci. 24, 70-74.
145 Savita, K. and Archana, P. (2014) Screening of keratinase producers from leather. J. Environ. Res. Dev. 8, 639-644.
146 Maki, M.L. et al. (2011) Characterization of some efficient cellulase producing bacteria isolated from paper mill sludges and organic fertilizers. Int. J. Biochem. Mol. Biol. 2, 146-154.
147 Gupta, P. et al. (2012) Isolation of cellulose-degrading bacteria and determination of their cellulolytic potential. Int. J. Microbiol. Vol 2012, Article 578925.

148 Rasul, F. et al. (2015) Screening and characterization of cellulase producing bacteria from soil and waste (molasses) of sugar industry. Int. J. Biosci. 6, 230-238.
149 Hitha, P.K. and Girija, D. (2014) Isolation and screening of native microbial isolates for pectinase activity. Int. J. Sci. Res. 3, 632-634.
150 Kaur, A. et al. (2012) Isolation, characterization and identification of bacterial strain producing amylase. J. Microbiol. Biotechnol. Res. 2, 573-579.
151 Johnson, F.S. et al. (2014) Amylase production by fungi isolated from Cassava processing site. J. Microbiol. Biotechnol. Res. 4, 23-30.
152 Zhang, C. and Kim, S.-K. (2010) Research and application of marine microbial enzymes: status and prospects. Mar. Drugs 8, 1920-1934.
153 Bramucci, M. and Nagarajan, V. (2006) Bacterial communities in industrial wastewater bioreactors. Curr. Opin. Microbiol. 9, 275-278.
154 Geetha, S.J. et al. (2013) Isolation and characterization of hydrocarbon degrading bacterial isolate from oil contaminated sites. APCBEE Procedia 5, 237-241.
155 Udosen, I.J. and Okon, O.G. (2014) Microbial load and enzyme activities of microorganisms isolated from waste oil contaminated soil in Akwa Ibom State, Nigeria. Int. J. Res. 1, 405-414.
156 Goddard, J.-P. and Reymond, J.-L. (2004) Enzyme assays for high-throughput screening. Curr. Opin. Biotechnol. 15, 314-322.
157 Goddard, J.-P. and Reymond, J.-L. (2004) Enzyme activity fingerprinting with substrate cocktails. J. Am. Chem. Soc. 126, 11116-11117.
158 Houfani, A.A. et al. (2017) Efficient screening of potential cellulases and hemicellulases produced by Bosea sp. FBZP-16 using the combination of enzyme assays and genome analysis. World J. Microbiol. Biotechnol. 33, 1-14.
159 Schlüter, H. et al. (2003) Detection of protease activities with the mass-spectrometry-assisted enzyme-screening (MES) system. Anal. Bioanal. Chem. 377, 1102-1107.
160 Yazbeck, D.R. et al. (2003) Automated enzyme screening methods for the preparation of enantiopure pharmaceutical intermediates. Adv. Synth. Catal. 345, 524-532.
161 Longwell, C.K. et al. (2017) High-throughput screening technologies for enzyme engineering. Curr. Opin. Biotechnol. 48, 196-202.
162 Cravatt, B.F. et al. (2008) Activity-based protein profiling: from enzyme chemistry to proteomic chemistry. Annu. Rev. Biochem. 77, 383-414.
163 Zweerink, S. et al. (2017) Activity-based protein profiling as a robust method for enzyme identification and screening in extremophilic Archaea. Nat. Commun. 8, 1-12.
164 Gibbons, S.M. and Gilbert, J.A. (2015) Microbial diversity - exploration of natural ecosystems and microbiomes. Curr. Opin. Genet. Dev. 35, 66-72.
165 Hug, L.A. et al. (2016) A new view of the tree of life. Nat. Microbiol. 1, Article 16048.
166 van den Burg, B. (2003) Extremophiles as a source for novel enzymes. Curr. Opin. Microbiol. 6, 213-218.

167 Sewalt, V. et al. (2016) The generally recognized as safe (GRAS) process for industrial microbial enzymes. Ind. Biotechnol. 12, 295-302.
168 Cherry, J.R. and Fidantsef, A.L. (2003) Directed evolution of industrial enzymes: an update. Curr. Opin. Biotechnol. 14, 438-443.
169 Ebert, M.C.C.J.C. and Pelletier, J.N. (2017) Computational tools for the enzyme improvement: why everyone can - and should - use them. Curr. Opin. Chem. Biol. 37, 89-96.
170 Feng, X. et al. (2012) Enhancing the efficiency of directed evolution in focused enzyme libraries by the adaptive substituent reordering algorithm. Chem. - A Eur. J. 18, 5646-5654.
171 Damborsky, J. and Brezovsky, J. (2014) Computational tools for designing and engineering enzymes. Curr. Opin. Chem. Biol. 19, 8-16.
172 Bornscheuer, U.T. et al. (2012) Engineering the third wave of biocatalysis. Nature 485, 185-194.
173 Murray, P. et al. (2013) Medical Microbiology, 7th Ed. Elsevier.
174 Buchholz, K. et al. (2012) Biocatalysts and enzyme technology, 2nd Ed. Wiley-Blackwell.
175 Stackebrandt, E. and Goebel, B.M. (1994) Taxonomic Note: a place for DNA-DNA reassociation and 16S rRNA sequence analysis in the present species definition in bacteriology. Int. J. Syst. Bacteriol. 44, 846-849.
176 Patel, J.B. (2001) 16S rRNA gene sequencing for bacterial pathogen identification in the clinical laboratory. Mol. Diagnosis 6, 313-321.
177 Janda, J.M. and Abbott, S.L. (2007) 16S rRNA gene sequencing for bacterial identification in the diagnostic laboratory: Pluses, perils, and pitfalls. J. Clin. Microbiol. 45, 2761-2764.
178 Srinivasan, R. et al. (2015) Use of 16S rRNA gene for identification of a broad range of clinically relevant bacterial pathogens. PLoS One 10, e0117617.
179 Woese, C.R. (1987) Bacterial evolution. Microbiol. Rev. 51, 221-271.
180 Woese, C.R. et al. (1990) Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. PNAS 87, 4576-4579.
181 Stackebrandt, E. and Ebers, J. (2006) Taxonomic parameters revisited: tarnished gold standards. Microbiol. Today 33, 152-155.
182 Wayne, L.G. et al. (1987) Report of the Ad Hoc Committee on reconciliation of approaches to bacterial systematics. Int. J. Syst. Bacteriol. 37, 463-464.
183 Beye, M. et al. (2018) Careful use of 16S rRNA gene sequence similarity values for the identification of Mycobacterium species. New Microbes New Infect. 22, 24-29.
184 Rossi-Tamisier, M. et al. (2015) Cautionary tale of using 16S rRNA gene sequence similarity values in identification of human-associated bacterial species. Int. J. Syst. Evol. Microbiol. 65, 1929-1934.
185 Kim, M. et al. (2014) Towards a taxonomic coherence between average nucleotide identity and 16S rRNA gene sequence similarity for species demarcation of prokaryotes. Int. J. Syst. Evol. Microbiol. 64, 346-351.
186 Porwal, S. et al. (2009) Phylogeny in aid of the present and novel microbial lineages: diversity in Bacillus. PLoS One 4, e4438.

85 187 Das, S. et al. (2014) Understanding molecular identification and polyphasic taxonomic approaches for genetic relatedness and phylogenetic relationships of microorganisms. J. Microbiol. Methods 103, 80-100. 188 Ciccarelli, F.D. et al. (2006) Toward automatic reconstruction of a highly resolved tree of life. Science 311, 1283-1287. 189 Creevey, C.J. et al. (2011) Universally distributed single-copy genes indicate a constant rate of horizontal transfer. PLoS One 6, e22099. 190 Mende, D.R. et al. (2013) Accurate and universal delineation of prokaryotic species. Nat. Methods 10, 881-884. 191 Anastas, P.T. et al. (2001) Green Engineering: Introduction. In Green Engineering (Anastas, P.T. et al., eds), pp. 1-5, American Chemical Society. 192 Tomei, M.C. and Daugulis, A.J. (2013) Ex situ bioremediation of contaminated soils: an overview of conventional and innovative technologies. Crit. Rev. Environ. Sci. Technol. 43, 2107-2139. 193 Crawford, R.L. and Crawford, D.L., eds. (1996) Bioremediation: Principles and Applications, Cambridge University Press. 194 Adams, G.O. et al. (2015) Bioremediation, biostimulation and bioaugmention: a review. Int. J. Environ. Bioremediation Biodegrad. 3, 28-39. 195 Nzila, A. et al. (2016) Bioaugmentation: an emerging strategy of industrial wastewater treatment for reuse and discharge. Int. J. Environ. Res. Public Health 13, 1-20. 196 Hamer, G. (1997) Microbial consortia for multiple pollutant biodegradation. Pure Appl. Chem. 69, 2343-2356. 197 Zhang, S. et al. (2018) Interkingdom microbial consortia mechanisms to guide biotechnological applications. Microb. Biotechnol. 11, 833-847. 198 Lindemann, S.R. et al. (2016) Engineering microbial consortia for controllable outputs. ISME J. 10, 2077-2084. 199 Baez-Rogelio, A. et al. (2017) Next generation of microbial inoculants for agriculture and bioremediation. Microb. Biotechnol. 10, 19-21. 200 Woo, S.L. and Pepe, O. 
(2018) Microbial consortia: promising probiotics as plant biostimulants for sustainable agriculture. Front. Plant Sci. 9, Article 1801. 201 Bradáčová, K. et al. (2019) Microbial consortia versus single-strain inoculants: an advantage in PGPM-assisted tomato production? Agronomy 9, Article 105. 202 Zhang, B. et al. (2017) Structure and function of the microbial consortia of activated sludge in typical municipal wastewater treatment plants in winter. Sci. Rep. 7, Article 17930. 203 Alami, N.H. et al. (2014) The influence of microbial consortium in bioremediation process using bioreactor. IPTEK J. Sci. 1, 1-4. 204 Shankar, S. et al. (2014) Application of indigenous microbial consortia in bioremediation of oil-contaminated soils. Int. J. Environ. Sci. Technol. 11, 367-376. 205 Asadirad, M.H.A. et al. (2016) Effects of indigenous microbial consortium in crude oil degradation: A

microcosm experiment. Int. J. Environ. Res. 10, 491-498. 206 Patowary, K. et al. (2016) Development of an efficient bacterial consortium for the potential remediation of hydrocarbons from contaminated sites. Front. Microbiol. 7, Article 1092. 207 Lee, Y. et al. (2018) Construction and evaluation of a Korean native microbial consortium for the bioremediation of diesel fuel-contaminated soil in Korea. Front. Microbiol. 9, Article 2594. 208 Villaverde, J. et al. (2018) Combined use of microbial consortia isolated from different agricultural soils and cyclodextrin as a bioremediation technique for herbicide contaminated soils. Chemosphere 193, 118-125. 209 Blanco-Enríquez, E.G. et al. (2018) Characterization of a microbial consortium for the bioremoval of polycyclic aromatic hydrocarbons (PAHs) in water. Int. J. Environ. Res. Public Health 15, Article 975. 210 de Lima Brossi, M.J. et al. (2016) Soil-derived microbial consortia enriched with different plant biomass reveal distinct players acting in lignocellulose degradation. Microb. Ecol. 71, 616-627. 211 Cortes-Tolalpa, L. et al. (2018) Halotolerant microbial consortia able to degrade highly recalcitrant plant biomass substrate. Appl. Microbiol. Biotechnol. 102, 2913-2927. 212 Muralikrishna, I.V. and Manickam, V. (2017) Wastewater Treatment Technologies. In Environmental Management. Science and Engineering for Industry, pp. 249-293, Butterworth-Heinemann. 213 Brandes, M. (1978) Characteristics of effluents from gray and black water septic tanks. J. (Water Pollut. Control Fed.) 50, 2547-2559. 214 WWAP (United Nations World Water Assessment Programme) (2017) The United Nations World Water Development Report 2017. Wastewater: The Untapped Resource, UNESCO. 215 von Sperling, M. (2007) Wastewater Characteristics, Treatment and Disposal, IWA Publishing. 216 Raunkjær, K. et al. (1994) Measurement of pools of protein, carbohydrate and lipid in domestic wastewater. Water Res. 28, 251-262. 217 Masters, G.M. and Ela, W.P.
(2014) Introduction to Environmental Engineering and Science, 3rd Ed. Pearson. 218 Jouanneau, S. et al. (2014) Methods for assessing biochemical oxygen demand (BOD): a review. Water Res. 49, 62-82. 219 Henze, M. et al. (2008) Biological Wastewater Treatment - Principles, Modelling and Design, IWA Publishing. 220 Rose, C. et al. (2015) The characterization of feces and urine: a review of the literature to inform advanced treatment technology. Crit. Rev. Environ. Sci. Technol. 45, 1827-1879. 221 Verachtert, H. et al. (1982) Investigations on cellulose biodegradation in activated sludge plants. J. Appl. Bacteriol. 52, 185-190. 222 Gupta, M. et al. (2018) Experimental assessment and validation of quantification methods for cellulose content in municipal wastewater and sludge. Environ. Sci. Pollut. Res. 25, 16743-16753. 223 Hurwitz, E. et al. (1961) Degradation of cellulose by activated sludge treatment. J. (Water Pollut. Control Fed.) 33, 1070-1075. 224 Bajpai, P. (2015) Management of pulp and paper mill waste, Springer.

225 Teixeira, P. et al. (2019) Integrated selection and identification of bacteria from polluted sites for biodegradation of lipids. Submitted to Int. Microbiol. Under final revision. 226 Huey, B. and Hall, J. (1989) Hypervariable DNA fingerprinting in Escherichia coli: minisatellite probe from bacteriophage M13. J. Bacteriol. 171, 2528-2532. 227 Vassart, G. et al. (1987) A sequence in M13 phage detects hypervariable minisatellites in human and animal DNA. Science 235, 683-684. 228 Ryskov, A.P. et al. (1988) M13 phage DNA as a universal marker for DNA fingerprinting of animals, plants and microorganisms. FEBS Lett. 233, 388-392. 229 Meyer, W. et al. (1991) Differentiation of species and strains among filamentous fungi by DNA fingerprinting. Curr. Genet. 19, 239-242. 230 Meyer, W. et al. (1993) Hybridization probes for conventional DNA fingerprinting used as single primers in the polymerase chain reaction to distinguish strains of Cryptococcus neoformans. J. Clin. Microbiol. 31, 2274-2280. 231 Meyer, W. and Mitchell, T.G. (1995) Polymerase chain reaction fingerprinting in fungi using single primers specific to minisatellites and simple repetitive DNA sequences: strain variation in Cryptococcus neoformans. Electrophoresis 16, 1648-1656. 232 Fladung, M. and Ziegenhagen, B. (1998) M13 DNA fingerprinting can be used in studies on phenotypic reversions of forest tree mutants. Trees 12, 310-314. 233 Edwards, U. et al. (1989) Isolation and direct complete nucleotide determination of entire genes. Characterization of a gene coding for 16S ribosomal RNA. Nucleic Acids Res. 17, 7843-7853. 234 Massol-Deya, A.A. et al. (1995) Bacterial community fingerprinting of amplified 16S and 16-23S ribosomal DNA gene sequences and restriction endonuclease analysis (ARDRA). In Molecular Microbial Ecology Manual 3.3.2 (Akkermans, A.D.L. et al., eds), pp. 1-8, Springer. 235 Chambel, L. et al. (2007) Occurrence and persistence of Listeria spp.
in the environment of ewe and cow’s milk cheese dairies in Portugal unveiled by an integrated analysis of identification, typing and spatial-temporal mapping along production cycle. Int. J. Food Microbiol. 116, 52-63. 236 Navidi, W. (2011) Statistics for Engineers and Scientists, 3rd Ed. McGraw Hill. 237 Sokal, R.R. and Michener, C.D. (1958) A statistical method for evaluating systematic relationships. Univ. Kansas Sci. Bull. 38, 1409-1438. 238 Coico, R. (2005) Gram Staining. Curr. Protoc. Microbiol. 00, A.3C.1-A.3C.2. 239 Buck, J.D. (1982) Nonstaining (KOH) method for determination of Gram reactions of marine bacteria. Appl. Environ. Microbiol. 44, 992-993. 240 Tindall, B.J. et al. (2007) Phenotypic characterization and the principles of comparative systematics. In Methods for General and Molecular Microbiology 3rd Ed. (Reddy, C.A. et al., eds), pp. 330-393, ASM Press. 241 Barrow, G.I. and Feltham, R.K.A. (2004) Cowan and Steel’s Manual for the Identification of Medical Bacteria, 3rd Ed. Cambridge University Press. 242 Sambrook, J. and Russell, D.W. (2001) Molecular Cloning - A Laboratory Manual Vol. 1, 3rd Ed. Cold

Spring Harbor Laboratory Press. 243 Hall, B.G. et al. (2014) Growth rates made easy. Mol. Biol. Evol. 31, 232-238. 244 Atkinson, K.E. (1989) An Introduction to Numerical Analysis, 2nd Ed. John Wiley & Sons. 245 Jolliffe, I.T. (2002) Principal Component Analysis, 2nd Ed. Springer. 246 Pearson, K. (1901) On lines and planes of closest fit to systems of points in space. London, Edinburgh, Dublin Philos. Mag. J. Sci. 2, 559-572. 247 Hotelling, H. (1933) Analysis of a complex of statistical variables into principal components. J. Educ. Psychol. 24, 417-441. 248 Triola, M.M. and Triola, M.F. (2014) Biostatistics for the Biological and Health Sciences with Statdisk, Pearson. 249 Sneath, P.H.A. and Sokal, R.R. (1973) Numerical Taxonomy. The principles and practice of numerical classification, W. H. Freeman and Company. 250 Sokal, R.R. and Rohlf, F.J. (1962) The comparison of dendrograms by objective methods. Taxon 11, 33-40. 251 Sokal, R.R. and Sneath, P.H.A. (1963) Principles of Numerical Taxonomy, W. H. Freeman and Company. 252 Mantel, N. (1967) The detection of disease clustering and a generalized regression approach. Cancer Res. 27, 209-220. 253 Fryer, T.F. et al. (1967) Methods for isolation and enumeration of lipolytic organisms. J. Dairy Sci. 50, 477-484. 254 Lanka, S. and Latha, J.N.L. (2015) A short review on various screening methods to isolate potential lipase producers: lipases - the present and future enzymes of biotech industry. Int. J. Biol. Chem. 9, 207-219. 255 Jeffries, C.D. et al. (1957) Rapid method for determining the activity of microorganisms on nucleic acids. J. Bacteriol. 73, 590-591. 256 Smith, P.B. et al. (1969) Improved medium for detecting deoxyribonuclease-producing bacteria. Appl. Microbiol. 18, 991-993. 257 van der Meulen, H.J. et al. (1974) Isolation and characterization of Cytophaga flevensis sp. nov., a new agarolytic flexibacterium. Antonie Van Leeuwenhoek 40, 329-346. 258 Zohra, R.R. et al.
(2013) Dextranase: hyper production of dextran degrading enzyme from newly isolated strain of Bacillus licheniformis. Carbohydr. Polym. 92, 2149-2153. 259 Kiran, T. et al. (2015) Industrially important hydrolytic enzyme diversity explored in stove ash bacterial isolates. Pak. J. Pharm. Sci. 28, 2035-2040. 260 Teather, R.M. and Wood, P.J. (1982) Use of Congo red-polysaccharide interactions in enumeration and characterization of cellulolytic bacteria from the bovine rumen. Appl. Environ. Microbiol. 43, 777-780. 261 Devi, M.C. and Kumar, M.S. (2012) Production, optimization and partial purification of cellulase by Aspergillus niger fermented with paper and timber sawmill industrial wastes. J. Microbiol. Biotechnol.

Res. 2, 120-128. 262 Liang, Y.-L. et al. (2014) Isolation, screening, and identification of cellulolytic bacteria from natural reserves in the subtropical region of China and optimization of cellulase production by Paenibacillus terrae ME27-1. Biomed Res. Int. Vol 2014, Article 512497. 263 Wood, T.M. and Bhat, K.M. (1988) Methods for measuring cellulase activities. Methods Enzymol. 160, 87-112. 264 Edberg, S.C. et al. (1976) Rapid spot test for the determination of esculin hydrolysis. J. Clin. Microbiol. 4, 180-184. 265 Kwon, K. et al. (1994) Detection of β-glucosidase activity in polyacrylamide gels with esculin as substrate. Appl. Environ. Microbiol. 60, 4584-4586. 266 Mattéotti, C. et al. (2011) New glucosidase activities identified by functional screening of a genomic DNA library from the gut microbiota of the termite Reticulitermes santonensis. Microbiol. Res. 166, 629-642. 267 Scheirlinck, T. et al. (1990) Cloning and expression of cellulase and xylanase genes in Lactobacillus plantarum. Appl. Microbiol. Biotechnol. 33, 534-541. 268 Samanta, A.K. et al. (2011) A simple and efficient diffusion technique for assay of endo β-1,4-xylanase activity. Brazilian J. Microbiol. 42, 1349-1353. 269 Colombo, P.M. and Rascio, N. (1977) Ruthenium red staining for electron microscopy of plant material. J. Ultrastruct. Res. 60, 135-139. 270 Gainvors, A. et al. (2000) Purification and characterization of acidic endo-polygalacturonase encoded by the PGL1-1 gene from Saccharomyces cerevisiae. FEMS Microbiol. Lett. 183, 131-135. 271 Awasthi, M.K. et al. (2018) Biodegradation of food waste using microbial cultures producing thermostable α-amylase and cellulase under different pH and temperature. Bioresour. Technol. 248, 160-170. 272 Djagny, K.B. et al. (2001) Gelatin: a valuable protein for food and pharmaceutical industries: Review. Crit. Rev. Food Sci. Nutr. 41, 481-492. 273 Alnahdi, H.S.
(2012) Isolation and screening of extracellular proteases produced by new isolated Bacillus sp. J. Appl. Pharm. Sci. 2, 71-74. 274 Christensen, W.B. (1946) Urea decomposition as a means of differentiating Proteus and paracolon cultures from each other and from Salmonella and Shigella types. J. Bacteriol. 52, 461-466. 275 Pitcher, D.G. et al. (1989) Rapid extraction of bacterial genomic DNA with guanidium thiocyanate. Lett. Appl. Microbiol. 8, 151-156. 276 Sabnis, R.W. (2010) Handbook of Biological Dyes and Stains: Synthesis and Industrial Applications, John Wiley & Sons. 277 Turner, S. et al. (1999) Investigating deep phylogenetic relationships among cyanobacteria and plastids by small subunit rRNA sequence analysis. J. Eukaryot. Microbiol. 46, 327-338. 278 Hughes, M.S. et al. (2000) Identification by 16S rRNA gene analyses of a potential novel mycobacterial species as an etiological agent of canine leproid granuloma syndrome. J. Clin. Microbiol. 38, 953-959.

279 Lane, D.J. (1991) 16S/23S rRNA Sequencing. In Nucleic Acid Techniques in Bacterial Systematics (Stackebrandt, E. and Goodfellow, M., eds), pp. 115-175, John Wiley & Sons. 280 Amann, R.I. et al. (1995) Phylogenetic identification and in situ detection of individual microbial cells without cultivation. Microbiol. Rev. 59, 143-169. 281 Matsuki, T. et al. (2002) Development of 16S rRNA-gene-targeted group-specific primers for the detection and identification of predominant bacteria in human feces. Appl. Environ. Microbiol. 68, 5445-5451. 282 Yang, B. et al. (2016) Sensitivity and correlation of hypervariable regions in 16S rRNA genes in phylogenetic analysis. BMC Bioinformatics 17, 1-8. 283 Schneider, C.A. et al. (2012) NIH Image to ImageJ: 25 years of image analysis. Nat. Methods 9, 671-675. 284 Schindelin, J. et al. (2015) The ImageJ ecosystem: an open platform for biomedical image analysis. Mol. Reprod. Dev. 82, 518-529. 285 Wang, Q. et al. (2007) Naïve Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl. Environ. Microbiol. 73, 5261-5267. 286 Yoon, S.-H. et al. (2017) Introducing EzBioCloud: a taxonomically united database of 16S rRNA gene sequences and whole-genome assemblies. Int. J. Syst. Evol. Microbiol. 67, 1613-1617. 287 Altschul, S.F. et al. (1990) Basic local alignment search tool. J. Mol. Biol. 215, 403-410. 288 Thompson, J.D. et al. (1994) CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673-4680. 289 Kumar, S. et al. (2018) MEGA X: Molecular evolutionary genetics analysis across computing platforms. Mol. Biol. Evol. 35, 1547-1549. 290 Saitou, N. and Nei, M. (1987) The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406-425. 291 Edwards, A.W.F. and Cavalli-Sforza, L.L. (1963) The reconstruction of evolution (abst.).
Ann. Hum. Genet. 27, 105. 292 Thompson, E.A. (1973) The method of minimum evolution. Ann. Hum. Genet. 36, 333-340. 293 Lemey, P. et al. (2009) The Phylogenetic Handbook - A practical approach to phylogenetic analysis and hypothesis testing, 2nd Ed. Cambridge University Press. 294 Felsenstein, J. (1985) Confidence limits on phylogenies: an approach using the bootstrap. Evolution. 39, 783-791. 295 Jukes, T.H. and Cantor, C.R. (1969) Evolution of protein molecules. In Mammalian Protein Metabolism (Munro, H.N., ed), pp. 21-132, Academic Press. 296 Zhang, Z. et al. (2000) A greedy algorithm for aligning DNA sequences. J. Comput. Biol. 7, 203-214. 297 Elbing, K. and Brent, R. (2002) Media preparation and bacteriological tools. Curr. Protoc. Mol. Biol. 59, 1.1.1-1.1.7. 298 Oblinger, J.L. and Koburger, J.A. (1975) Understanding and teaching the Most Probable Number

technique. J. Milk Food Technol. 38, 540-545. 299 Blodgett, R. (2010) FDA, Bacteriological Analytical Manual, Appendix 2: Most probable number from serial dilutions. 300 USDA (2014) Microbiology Laboratory Guidebook, Appendix 2.05: Most probable number procedure and tables. 301 de Man, J.C. (1983) MPN tables, corrected. Eur. J. Appl. Microbiol. Biotechnol. 17, 301-305. 302 Campbell, M.K. and Farrell, S.O. (2012) Biochemistry, 8th Ed. Cengage Learning. 303 Makinen, K.K. (2010) Sugar alcohols, caries incidence, and remineralization of caries lesions: a literature review. Int. J. Dent. Vol 2010, Article 981072. 304 Burg, M.B. and Ferraris, J.D. (2008) Intracellular organic osmolytes: function and regulation. J. Biol. Chem. 283, 7309-7313. 305 Schiweck, H. et al. (2012) Sugar Alcohols. In Ullmann’s Encyclopedia of Industrial Chemistry, Electronic Release, pp. 1-37, Wiley-VCH. 306 Christoph, R. et al. (2012) Glycerol. In Ullmann’s Encyclopedia of Industrial Chemistry, Electronic Release, pp. 67-82, Wiley-VCH. 307 Kumdam, H. et al. (2014) Arabitol production by microbial fermentation - biosynthesis and future applications. Int. J. Sci. Appl. Res. 1, 1-12. 308 O’Donnell, K. and Kearsley, M.W. (2012) Sweeteners and Sugar Alternatives in Food Technology, 2nd Ed. Wiley-Blackwell. 309 Madsen, A.Ø. and Larsen, S. (2005) Ribitol and xylitol: explaining the differences in physical chemical properties. Acta Crystallogr. 310 Madsen, A.Ø. et al. (2011) Understanding thermodynamic properties at the molecular level: multiple temperature charge density study of ribitol and xylitol. J. Phys. Chem. A 115, 7794-7804. 311 Bates, T.R. et al. (1973) Kinetics of hydrolysis of polyoxyethylene (20) sorbitan fatty acid ester surfactants. J. Pharm. Pharmacol. 25, 470-477. 312 Larsen, H. (1986) Halophilic and halotolerant microorganisms - an overview and historical perspective. FEMS Microbiol. Rev. 39, 3-7. 313 Wu, H.-S. and Tsai, M.-J. (2004) Kinetics of tributyrin hydrolysis by lipase. Enzyme Microb.
Technol. 35, 488-493. 314 Jurado, E. et al. (2006) Kinetic model for the enzymatic hydrolysis of tributyrin in O/W emulsions. Chem. Eng. Sci. 61, 5010-5020. 315 Mateos-Díaz, E. et al. (2012) High-throughput screening method for lipases/esterases. In Lipases and Phospholipases. Methods in Molecular Biology (Methods and Protocols) 861 (Sandoval, G., ed), pp. 403-431, Humana Press. 316 Singh, B. and Marshall, R.T. (1966) Bacterial deoxyribonuclease production and its possible influence on mastitis detection. J. Dairy Sci. 49, 822-825. 317 Sumby, P. et al. (2005) Extracellular deoxyribonuclease made by group A Streptococcus assists pathogenesis by enhancing evasion of the innate immune response. PNAS 102, 1679-1684.

318 Buchanan, J.T. et al. (2006) DNase expression allows the pathogen Group A Streptococcus to escape killing in neutrophil extracellular traps. Curr. Biol. 16, 396-400. 319 Hasegawa, T. et al. (2010) Characterization of a virulence-associated and cell-wall-located DNase of Streptococcus pyogenes. Microbiology 156, 184-190. 320 Palmer, L.J. et al. (2012) Extracellular deoxyribonuclease production by periodontal bacteria. J. Periodontal Res. 47, 439-445. 321 Fox, J.B. and Holtman, D.F. (1968) Effect of anaerobiosis on staphylococcal nuclease production. J. Bacteriol. 95, 1548-1550. 322 Béguin, P. and Aubert, J.-P. (1994) The biological degradation of cellulose. FEMS Microbiol. Rev. 13, 25-58. 323 Hakovirta, J.R. et al. (2016) Identification and analysis of informative single nucleotide polymorphisms in 16S rRNA gene sequences of the Bacillus cereus group. J. Clin. Microbiol. 54, 2749-2756. 324 Ash, C. et al. (1991) Phylogenetic heterogeneity of the genus Bacillus revealed by comparative analysis of small-subunit-ribosomal RNA sequences. Lett. Appl. Microbiol. 13, 202-206. 325 Xu, D. and Côté, J.C. (2003) Phylogenetic relationships between Bacillus species and related genera inferred from comparison of 3′ end 16S rDNA and 5′ end 16S-23S ITS nucleotide sequences. Int. J. Syst. Evol. Microbiol. 53, 695-704. 326 Wang, A. and Ash, G.J. (2015) Whole genome phylogeny of Bacillus by Feature Frequency Profiles (FFP). Sci. Rep. 5, 1-14. 327 Dunlap, C.A. et al. (2016) Bacillus nakamurai sp. nov., a black-pigment-producing strain. Int. J. Syst. Evol. Microbiol. 66, 2987-2991. 328 Roberts, M.S. et al. (1996) Bacillus vallismortis sp. nov., a close relative of Bacillus subtilis, isolated from soil in Death Valley, California. Int. J. Syst. Bacteriol. 46, 470-475. 329 Badnore, A.U. et al. (2019) Preparation of antibacterial peel-off facial mask formulation incorporating biosynthesized silver nanoparticles. Appl. Nanosci. 9, 279-287. 330 Kaur, P.K. et al.
(2015) Antifungal potential of Bacillus vallismortis R2 against different phytopathogenic fungi. Spanish J. Agric. Res. 13, e1004. 331 Kaur, P.K. et al. (2017) Evaluation of Bacillus vallismortis (Bacillales: Bacillaceae) R2 as insecticidal agent against polyphagous pest Spodoptera litura (Lepidoptera: Noctuidae). 3 Biotech 7, 346. 332 Kim, O.-S. et al. (2012) Introducing EzTaxon-e: a prokaryotic 16S rRNA gene sequence database with phylotypes that represent uncultured species. Int. J. Syst. Evol. Microbiol. 62, 716-721. 333 Kloos, W.E. and Wolfshohl, J.F. (1991) Staphylococcus cohnii subspecies: Staphylococcus cohnii subsp. cohnii nov. and Staphylococcus cohnii subsp. urealyticum subsp. nov. Int. J. Syst. Bacteriol. 41, 284-289. 334 Shahandeh, Z. et al. (2015) Association of Staphylococcus cohnii subspecies urealyticum infection with recurrence of renal staghorn stone. Casp. J. Intern. Med. 6, 40-42. 335 Garg, S. (2017) Staphylococcus cohnii: not so innocuous. J. Acute Dis. 6, 239-240. 336 Szewczyk, E.M. et al. (2003) Potential role of Staphylococcus cohnii in a hospital environment. Microb.

Ecol. Health Dis. 15, 51-56. 337 Schleifer, K.H. and Kloos, W.E. (1975) Isolation and characterization of staphylococci from human skin. I. Amended descriptions of Staphylococcus epidermidis and Staphylococcus saprophyticus and descriptions of three new species: Staphylococcus cohnii, Staphylococcus haemolyticus, and Staphylococcus xylosus. Int. J. Syst. Bacteriol. 25, 50-61. 338 Hajek, V. et al. (1996) Staphylococcus saprophyticus subsp. bovis subsp. nov., isolated from bovine nostrils. Int. J. Syst. Bacteriol. 46, 792-796. 339 Pantůček, R. et al. (2018) Staphylococcus edaphicus sp. nov., isolated in Antarctica, harbors the mecC gene and genomic islands with a suspected role in adaptation to extreme environments. Appl. Environ. Microbiol. 84, e01746-17. 340 Pead, L. et al. (1985) Staphylococcus saprophyticus as a urinary pathogen: a six year prospective survey. Br. Med. J. 291, 1157-1159. 341 Kuroda, M. et al. (2005) Whole genome sequence of Staphylococcus saprophyticus reveals the pathogenesis of uncomplicated urinary tract infection. PNAS 102, 13272-13277. 342 Hur, J. et al. (2016) Staphylococcus saprophyticus bacteremia originating from urinary tract infections: a case report and literature review. J. Infect. Chemother. 48, 136-139. 343 Sun, L.-N. et al. (2013) Comamonas jiangduensis sp. nov., a biosurfactant-producing bacterium isolated from agricultural soil. Int. J. Syst. Evol. Microbiol. 63, 2168-2173. 344 Wu, Y. et al. (2018) The core- and pan-genomic analyses of the genus Comamonas: from environmental adaptation to potential virulence. Front. Microbiol. 9, Article 3096. 345 Hickman-Brenner, F.W. et al. (1987) Aeromonas veronii, a new ornithine decarboxylase-positive species that may cause diarrhea. J. Clin. Microbiol. 25, 900-906. 346 Joseph, A.V. et al. (2013) Occurrence of potential pathogenic Aeromonas species in tropical seafood, aquafarms and mangroves off Cochin coast in South India. Vet. World 6, 300-306. 347 Cai, S.-H. et al.
(2012) Characterization of pathogenic Aeromonas veronii bv. veronii associated with ulcerative syndrome from Chinese longsnout catfish (Leiocassis longirostris Günther). Brazilian J. Microbiol. 43, 382-388. 348 Janda, J.M. and Abbott, S.L. (2010) The genus Aeromonas: taxonomy, pathogenicity, and infection. Clin. Microbiol. Rev. 23, 35-73. 349 Yamada, S. et al. (1997) Incidence and clinical symptoms of Aeromonas-associated travellers’ diarrhoea in Tokyo. Epidemiol. Infect. 119, 121-126. 350 Rhee, J.Y. et al. (2016) Clinical and therapeutic implication of Aeromonas bacteremia: 14 years nation-wide experiences in Korea. Infect. Chemother. 48, 274-284. 351 Mencacci, A. et al. (2003) Aeromonas veronii biovar veronii septicaemia and acute suppurative cholangitis in a patient with hepatitis B. J. Med. Microbiol. 52, 727-730. 352 Shiina, Y. et al. (2004) An Aeromonas veronii biovar sobria infection with disseminated intravascular gas production. J. Infect. Chemother. 10, 37-41. 353 Roberts, M.T.M. et al. (2006) Aeromonas veronii biovar sobria bacteraemia with septic arthritis

confirmed by 16S rDNA PCR in an immunocompetent adult. J. Med. Microbiol. 55, 241-243. 354 Gröbner, S. et al. (2007) Severe diarrhoea caused by Aeromonas veronii biovar sobria in a patient with metastasised GIST. Polish J. Microbiol. 56, 277-279. 355 Hassan, A. et al. (2011) Aeromonas veronii biovar sobria gastroenteritis: a case report. Arch. Clin. Microbiol. 2, 1-3. 356 Ottaviani, D. et al. (2012) A severe case of Aeromonas veronii biovar sobria travellers’ diarrhoea characterized by co-isolation. J. Med. Microbiol. 62, 161-164. 357 Collins, M.D. et al. (1993) Aeromonas enteropelogenes and Aeromonas ichthiosmia are identical to Aeromonas trota and Aeromonas veronii respectively, as revealed by small-subunit rRNA sequence analysis. Int. J. Syst. Bacteriol. 43, 855-856. 358 Carnahan, A.M. et al. (1991) Aeromonas trota sp. nov., an ampicillin-susceptible species isolated from clinical specimens. J. Clin. Microbiol. 29, 1206-1210. 359 Martin-Carnahan, A. and Joseph, S.W. (2015) Genus Aeromonas. In Bergey’s Manual of Systematics of Archaea and Bacteria, pp. 1-44, John Wiley & Sons. 360 De Luca, F. et al. (2010) Genetic and biochemical characterization of TRU-1, the endogenous class C β-lactamase from Aeromonas enteropelogenes. Antimicrob. Agents Chemother. 54, 1547-1554. 361 Reina, J. and Lopez, A. (1996) Gastroenteritis caused by Aeromonas trota in a child. J. Clin. Pathol. 49, 173-175. 362 Lai, C.-C. et al. (2007) Wound infection and septic shock due to Aeromonas trota in a patient with liver cirrhosis. Clin. Infect. Dis. 44, 1523-1524. 363 Martin, R.M. and Bachman, M.A. (2018) Colonization, infection, and the accessory genome of Klebsiella pneumoniae. Front. Cell. Infect. Microbiol. 8, Article 4. 364 Seiffert, S.N. et al. (2019) First clinical case of KPC-3-producing Klebsiella michiganensis in Europe. New Microbes New Infect. 29, 100516. 365 Singh, L. et al. (2016) Klebsiella oxytoca: an emerging pathogen? Med. J. Armed Forces India 72, S59-S61. 366 Brady, C. et al.
(2014) Gibbsiella greigii sp. nov., a novel species associated with oak decline in the USA. Syst. Appl. Microbiol. 37, 417-422. 367 Jain, A.K. and Yadav, R. (2018) First report of isolation and antibiotic susceptibility pattern of Raoultella electrica from table eggs in Jaipur, India. New Microbes New Infect. 21, 95-99. 368 Dickey, R.S. and Zumoff, C.H. (1988) Emended description of Enterobacter cancerogenus comb. nov. (formerly Erwinia cancerogena). Int. J. Syst. Bacteriol. 38, 371-374. 369 Schønheyder, H.C. et al. (1994) Taxonomic notes: synonymy of Enterobacter cancerogenus (Urosević 1966) Dickey and Zumoff 1988 and Enterobacter taylorae Farmer et al. 1985 and resolution of an ambiguity in the biochemical profile. Int. J. Syst. Bacteriol. 44, 586-587. 370 Rubinstien, E.M. et al. (1993) Enterobacter taylorae, a new opportunistic pathogen: report of four cases. J. Clin. Microbiol. 31, 249-254. 371 Abbott, S.L. and Janda, J.M. (1997) Enterobacter cancerogenus (‘Enterobacter taylorae’) Infections

associated with severe trauma or crush injuries. Microbiol. Infect. Dis. 107, 359-361. 372 Garazzino, S. et al. (2005) Osteomyelitis caused by Enterobacter cancerogenus infection following a traumatic injury: case report and review of the literature. J. Clin. Microbiol. 43, 1459-1461. 373 Demir, T. et al. (2014) Pneumonia due to Enterobacter cancerogenus infection. Folia Microbiol. (Praha). 59, 527-530. 374 Clermont, D. et al. (2015) Multilocus sequence analysis of the genus Citrobacter and description of Citrobacter pasteurii sp. nov. Int. J. Syst. Evol. Microbiol. 65, 1486-1490. 375 Samonis, G. et al. (2009) Citrobacter infections in a general hospital: characteristics and outcomes. Eur. J. Clin. Microbiol. Infect. Dis. 28, 61-68. 376 Ribeiro, T.G. et al. (2017) Citrobacter portucalensis sp. nov., isolated from an aquatic sample. Int. J. Syst. Evol. Microbiol. 67, 3513-3517. 377 Carr, E.L. et al. (2003) Seven novel species of Acinetobacter isolated from activated sludge. Int. J. Syst. Evol. Microbiol. 53, 953-963. 378 Chen, T.-L. et al. (2008) Acinetobacter baylyi as a pathogen for opportunistic infection. J. Clin. Microbiol. 46, 2938-2944. 379 Murray, C.K. and Hospenthal, D.R. (2008) Acinetobacter infection in the ICU. Crit. Care Clin. 24, 237-248. 380 Towner, K.J. (2009) Acinetobacter: an old friend, but a new enemy. J. Hosp. Infect. 73, 355-363. 381 Singh, N.K. et al. (2014) Genome sequencing and annotation of Acinetobacter gerneri strain MTCC 9824T. Genomics Data 2, 7-9. 382 Howard, G.T. et al. (2012) Growth of Acinetobacter gerneri P7 on polyurethane and the purification and characterization of a polyurethanase enzyme. Biodegradation 23, 561-573. 383 Xie, F. et al. (2014) Pseudomonas kunmingensis sp. nov., an exopolysaccharide-producing bacterium isolated from a phosphate mine. Int. J. Syst. Evol. Microbiol. 64, 559-564. 384 Muratova, A. et al. (2015) The coupling of the plant and microbial catabolisms of phenanthrene in the rhizosphere of Medicago sativa.
J. Plant Physiol. 188, 1-8. 385 Elomari, M. et al. (1997) Pseudomonas monteilii sp. nov., isolated from clinical specimens. Int. J. Syst. Bacteriol. 47, 846-852. 386 Aditi et al. (2017) Exacerbation of bronchiectasis by Pseudomonas monteilii: a case report. BMC Infect. Dis. 17, 511. 387 Gupta, A. et al. (2018) Pseudomonas monteilii an emerging pathogen in meningoencephalitis. J. Clin. Diagnostic Res. 12, DD04-DD05. 388 Ocampo-Sosa, A.A. et al. (2015) Isolation of VIM-2-producing Pseudomonas monteilii clinical strains disseminated in a tertiary hospital in northern Spain. Antimicrob. Agents Chemother. 59, 1334-1336. 389 Bogaerts, P. et al. (2011) IMP-13-producing Pseudomonas monteilii recovered in a hospital environment. J. Antimicrob. Chemother. 66, 2434-2435. 390 Dharni, S. et al. (2014) Impact of plant growth promoting Pseudomonas monteilii PsF84 and Pseudomonas plecoglossicida PsF610 on metal uptake and production of secondary metabolite

(monoterpenes) by rose-scented geranium (Pelargonium graveolens cv. bourbon) grown on tannery sludge. Chemosphere 117, 433-439. 391 Stolz, A. et al. (2007) Pseudomonas knackmussii sp. nov. Int. J. Syst. Evol. Microbiol. 57, 572-576. 392 Wang, Y.-N. et al. (2012) Pseudomonas nitritireducens sp. nov., a nitrite reduction bacterium isolated from wheat soil. Arch. Microbiol. 194, 809-813. 393 Iizuka, H. and Komagata, K. (1964) Microbiological studies on petroleum and natural gas. I. Determination of hydrocarbon-utilizing bacteria. J. Gen. Appl. Microbiol. 10, 207-221. 394 Lang, E. et al. (2007) Characterization of “Pseudomonas azelaica” DSM 9128, leading to emended descriptions of Pseudomonas citronellolis Seubert 1960 (Approved Lists 1980) and Pseudomonas nitroreducens Iizuka and Komagata 1964 (Approved Lists 1980), including Pseudomonas multir. Int. J. Syst. Evol. Microbiol. 57, 878-882. 395 Miyazaki, R. et al. (2015) Comparative genome analysis of Pseudomonas knackmussii B13, the first bacterium known to degrade chloroaromatic compounds. Environ. Microbiol. 17, 91-104. 396 Wolf, A. et al. (2002) Stenotrophomonas rhizophila sp. nov., a novel plant-associated bacterium with antifungal properties. Int. J. Syst. Evol. Microbiol. 52, 1937-1944. 397 Schmidt, C.S. et al. (2012) Stenotrophomonas rhizophila DSM14405T promotes plant growth probably by altering fungal communities in the rhizosphere. Biol. Fertil. Soils 48, 947-960. 398 Alavi, P. et al. (2013) Root-microbe systems: the effect and mode of interaction of Stress Protecting Agent (SPA) Stenotrophomonas rhizophila DSM14405T. Front. Plant Sci. 4, Article 141. 399 Brooke, J.S. (2012) Stenotrophomonas maltophilia: an emerging global opportunistic pathogen. Clin. Microbiol. Rev. 25, 2-41. 400 Yang, C. et al. (2017) Simultaneous hydrolysis of carbaryl and chlorpyrifos by Stenotrophomonas sp. strain YC-1 with surface-displayed carbaryl hydrolase. Sci. Rep. 7, Article 13391. 401 Zaffar, H. et al.
(2018) Kinetics of endosulfan biodegradation by Stenotrophomonas maltophilia EN-1 isolated from pesticide-contaminated soil. Soil Sediment Contam. 27, 1-13. 402 Stackebrandt, E. (1992) Unifying Phylogeny and Phenotypic Diversity. In The Prokaryotes: A Handbook on the Biology of Bacteria: Ecophysiology, Isolation, Identification, Applications 2nd Ed. (Balows, A. et al., eds), pp. 19-47, Springer-Verlag. 403 Moran, S. (2018) An Applied Guide to Water and Effluent Treatment Plant Design, Elsevier. 404 Numberger, D. et al. (2019) Characterization of bacterial communities in wastewater with enhanced taxonomic resolution by full-length 16S rRNA sequencing. Sci. Rep. 9, Article 9673. 405 Haaber, J. et al. (2012) Planktonic aggregates of Staphylococcus aureus protect against common antibiotics. PLoS One 7, e41075. 406 Kragh, K.N. et al. (2016) Role of multicellular aggregates in biofilm formation. MBio 7, e00237-16. 407 Cydzik-Kwiatkowska, A. and Zielińska, M. (2016) Bacterial communities in full-scale wastewater treatment systems. World J. Microbiol. Biotechnol. 32, Article 66. 408 Saunders, A.M. et al. (2016) The activated sludge ecosystem contains a core community of abundant organisms. ISME J. 10, 11-20.

409 Rittstieg, K. et al. (2001) Aerobic treatment of a concentrated urea wastewater with simultaneous stripping of ammonia. Appl. Microbiol. Biotechnol. 56, 820-825. 410 Liu, Y. et al. (2019) The roles of free ammonia (FA) in biological wastewater treatment processes: a review. Environ. Int. 123, 10-19. 411 Olkiewicz, M. et al. (2015) Efficient extraction of lipids from primary sewage sludge using ionic liquids for biodiesel production. Sep. Purif. Technol. 153, 118-125. 412 Farghaly, A. et al. (2017) Bioethanol production from paperboard mill sludge using acid-catalyzed bio-derived choline acetate ionic liquid pretreatment followed by fermentation process. Energy Convers. Manag. 145, 255-264. 413 Jaria, G. et al. (2017) Sludge from paper mill effluent treatment as raw material to produce carbon adsorbents: an alternative waste management strategy. J. Environ. Manage. 188, 203-211. 414 Glińska, K. et al. (2019) Separation of cellulose from industrial paper mill wastewater dried sludge using a commercial and cheap ionic liquid. Water Sci. Technol. 79, 1897-1904. 415 Ruiken, C.J. et al. (2013) Sieving wastewater - cellulose recovery, economic and energy evaluation. Water Res. 47, 43-48. 416 Chowdhury, S. et al. (2011) Novel microbial consortium for laboratory scale lead removal from city effluent. J. Environ. Sci. Technol. 4, 41-54. 417 Ding, C.-Q. et al. (2017) Study on community structure of microbial consortium for the degradation of viscose fiber wastewater. Bioresour. Bioprocess. 4, 1-9. 418 Li, M. et al. (2013) Heavy metal removal by biomineralization of urease producing bacteria isolated from soil. Int. Biodeterior. Biodegrad. 76, 81-85. 419 Martin, N.H. et al. (2016) The evolving role of coliforms as indicators of unhygienic processing conditions in dairy foods. Front. Microbiol. 7, Article 1549. 
420 International Organization for Standardization (2014) ISO 9308-1:2014 Water quality - Enumeration of Escherichia coli and coliform bacteria - Part 1: Membrane filtration method for waters with low bacterial background flora. 421 Niyonzima, F.N. and More, S.S. (2014) Detergent-compatible bacterial amylases. Appl. Biochem. Biotechnol. 174, 1215-1232. 422 Guerrand, D. (2017) Lipases industrial applications: focus on food and agroindustries. OCL - Oilseeds fats, Crop. Lipids 24, D403.

Appendices

Appendix A List of hydrolases

Supplementary Table 1. EC number, accepted name, and systematic name for the hydrolases referred to in the present thesis.
EC Number* | Accepted Name | Systematic Name
EC 3.1.1.1 | carboxylesterase | carboxylic-ester hydrolase
EC 3.1.1.3 | triacylglycerol lipase | triacylglycerol acylhydrolase
EC 3.1.1.11 | pectinesterase | pectin pectylhydrolase
EC 3.1.12.1 | 5' to 3' exodeoxyribonuclease (nucleoside 3'-phosphate-forming)# |
EC 3.2.1.1 | α-amylase | 4-α-D-glucan glucanohydrolase
EC 3.2.1.2 | β-amylase | 4-α-D-glucan maltohydrolase
EC 3.2.1.3 | glucan 1,4-α-glucosidase | 4-α-D-glucan glucohydrolase
EC 3.2.1.4 | cellulase | 4-(1,3;1,4)-β-D-glucan 4-glucanohydrolase
EC 3.2.1.6 | endo-1,3(4)-β-glucanase | 3-(1→3;1→4)-β-D-glucan 3(4)-glucanohydrolase
EC 3.2.1.8 | endo-1,4-β-xylanase | 4-β-D-xylan xylanohydrolase
EC 3.2.1.11 | dextranase | 6-α-D-glucan 6-glucanohydrolase
EC 3.2.1.15 | polygalacturonase | (1→4)-α-D-galacturonan glycanohydrolase
EC 3.2.1.20 | α-glucosidase | α-D-glucoside glucohydrolase
EC 3.2.1.21 | β-glucosidase | β-D-glucoside glucohydrolase
EC 3.2.1.23 | β-galactosidase | β-D-galactoside galactohydrolase
EC 3.2.1.37 | xylan 1,4-β-xylosidase | 4-β-D-xylan xylohydrolase
EC 3.2.1.40 | α-L-rhamnosidase | α-L-rhamnoside rhamnohydrolase
EC 3.2.1.41 | pullulanase | pullulan 6-α-glucanohydrolase
EC 3.2.1.52 | β-N-acetylhexosaminidase | β-N-acetyl-D-hexosaminide N-acetylhexosaminohydrolase
EC 3.2.1.55 | non-reducing end α-L-arabinofuranosidase | α-L-arabinofuranoside non-reducing end α-L-arabinofuranosidase
EC 3.2.1.67 | galacturan 1,4-α-galacturonidase | poly[(1→4)-α-D-galacturonide] galacturonohydrolase
EC 3.2.1.68 | isoamylase | glycogen α-1,6-glucanohydrolase
EC 3.2.1.73 | licheninase | (1→3)-(1→4)-β-D-glucan 4-glucanohydrolase
EC 3.2.1.81 | β-agarase | agarose 4-glycanohydrolase
EC 3.2.1.82 | exo-poly-α-galacturonosidase | poly[(1→4)-α-D-galactosiduronate] digalacturonohydrolase
EC 3.2.1.91 | cellulose 1,4-β-cellobiosidase (non-reducing end) | 4-β-D-glucan cellobiohydrolase (non-reducing end)
EC 3.2.1.95 | dextran 1,6-α-isomaltotriosidase | 6-α-D-glucan isomaltotriohydrolase
EC 3.2.1.108 | lactase | lactose galactohydrolase
EC 3.2.1.139 | α-glucuronidase | α-D-glucosiduronate glucuronohydrolase
EC 3.2.1.142 | limit dextrinase | dextrin 6-α-glucanohydrolase
EC 3.2.1.158 | α-agarase | agarose 3-glycanohydrolase
EC 3.4.24.24 | gelatinase A# |
EC 3.4.24.35 | gelatinase B# |
EC 3.5.1.5 | urease | urea amidohydrolase
* Data retrieved from the Enzyme database maintained by the School of Biological and Chemical Sciences, Queen Mary University of London (hosted at https://www.qmul.ac.uk/sbcs/iubmb/enzyme/).
# 5' to 3' exodeoxyribonuclease (nucleoside 3'-phosphate-forming), gelatinase A, and gelatinase B do not have a systematic name.

Appendix B Applications of relevant hydrolases

Supplementary Table 2. Applications of some industrially and biotechnologically relevant hydrolases.
Group of Enzymes | Applications*
Lipases and Esterases | Food industry (cheese flavoring, infant food production, emulsifying power improvement of lipids, vegetal oil processing). Production of active pharmaceutical compounds. Removal of waxes in paper and pulp industry. In detergents (digestion of lipid stains). Production of biodegradable polymers (polylactide, polycaprolactone). Biodiesel production from vegetal oil or animal fats. Wastewater treatment.
Starch degrading enzymes | Liquefaction and saccharification of starch for glucose production. Production of maltose and glucose syrups. Production of baking goods. Juice clarification. Biofuel production (ethanol) from corn starch, rice, and other starchy crops. In detergents (digestion of starchy stains). Removal of starchy materials in paper and pulp industry. Textile desizing.
Lignocellulose degrading enzymes | Animal feed production (to increase digestibility). Biofuel conversion (second-generation ethanol production). In detergents (removal of fibrils to improve the overall appearance of fabrics). Paper and pulp industry, including chlorine-free bleaching. Textile industry (modification of cellulosic fibers).
Pectin degrading enzymes | Extraction and clarification of fruit and vegetable juices, including wine production. Animal feed production (to lower viscosity and increase nutrient absorption). Hydrolysis of pectin in agroindustrial wastes in bioethanol production. Extraction of vegetable oils. Degumming of fibers; scouring cotton, and retting of flax and hemp, in textile production.
Agarases | Production of agar oligosaccharides for low-calorie food and food stabilizers.
Proteolytic enzymes | Food industry (cheese making, infant food, low allergenic milk protein production). Animal feed production. In detergents (digestion of protein stains). Leather treatment (in dehairing, bating, and tanning processes). Silk degumming. Wool treatment. Waste treatment.
Ureases | In fermented beverages, like wine, to prevent the production of ethyl carbamate. Determination of urea in biological fluids. In diagnostics and therapeutics. For pollution control, in biosensors, and for wastewater treatment.
* Data retrieved from [13,19,37,54,63,65,66,70,112,130,149,152,166,421,422].

Appendix C BioTask Bioremediation Culture collection

[Figure panels: G1, G2, G3, and G4, each labelled (M13 & PH) with columns M13 and PH.]

Supplementary Figure 1. Composite dendrograms for all BBC strains. Data obtained from PCR fingerprinting profiles with primers csM13 and PH were integrated. The dendrograms were generated using the Pearson product-moment coefficient to compute the similarity between isolates, and UPGMA as the clustering method. From top left: G1, Gram-positive rods (n = 66); G2, Gram-positive cocci (n = 71); G3, Gram-negative rods (n = 48); G4, Gram-negative cocci (n = 17). Scales correspond to the global percentage of similarity. Courtesy of Pedro Teixeira.
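The clustering approach named in the caption (Pearson similarity followed by UPGMA) can be illustrated with a minimal pure-Python sketch. The profiles below are hypothetical band-intensity vectors, not data from the BBC collection, and the use of 1 - r as the distance is an assumption made for illustration; the software and settings actually used for the dendrograms are not stated here.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def upgma(profiles):
    """Return the merge order as (cluster_a, cluster_b, distance) tuples."""
    names = list(profiles)
    # Assumed distance: 1 - r, so perfectly correlated profiles sit at 0.
    dist = {frozenset(p): 1 - pearson(profiles[p[0]], profiles[p[1]])
            for p in combinations(names, 2)}

    def average_distance(c1, c2):
        # UPGMA: unweighted mean of all pairwise distances between members.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_distance(*pair))
        merges.append((c1, c2, average_distance(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Hypothetical fingerprint profiles (illustrative only).
profiles = {
    "iso_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "iso_B": [1.1, 2.0, 3.2, 3.9, 5.1],
    "iso_C": [5.0, 4.0, 3.0, 2.0, 1.0],
    "iso_D": [10.0, 8.0, 6.0, 4.0, 2.0],
}
merges = upgma(profiles)
for left, right, distance in merges:
    print(left, right, round(distance, 3))
```

In this toy example the two pairs with near-identical profile shapes merge first, and the two resulting clusters join last at a distance close to 2 because their profiles are anticorrelated.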

Appendix D Rapid growth assay – Principal component analysis

Supplementary Figure 2. Scatterplots showing the relation between NAUCs at 36 h and 24-48 h in TSB. In each panel the x-axis is NAUC TSB 36h and the y-axis is the NAUC in TSB at the compared time point; both axes run from 0 to 75. A) 36 h versus 24 h (y = 0.6468x - 2.4118; R² = 0.9589; r = 0.979). B) 36 h versus 28 h (y = 0.7666x - 1.7315; R² = 0.984; r = 0.992). C) 36 h versus 32 h (y = 0.8701x - 1.0501; R² = 0.9956; r = 0.998). D) 36 h versus 40 h (y = 1.1095x + 1.1422; R² = 0.9973; r = 0.999). E) 36 h versus 44 h (y = 1.2157x + 2.4356; R² = 0.9899; r = 0.995). F) 36 h versus 48 h (y = 1.3213x + 3.7695; R² = 0.9784; r = 0.989). All p-values < 0.001.
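As a consistency check on Supplementary Figure 2, the sketch below compares the R² values printed on the TSB panels with the r values quoted in the caption, using the fact that for a simple linear regression with positive slope r equals the square root of R². The function name `predict_24h_from_36h` and the example input of 40.0 are illustrative choices, not part of the original analysis.

```python
from math import sqrt

# (R² from the panel, r from the caption) for the TSB comparisons.
tsb_panels = {
    "A: 36 h vs 24 h": (0.9589, 0.979),
    "B: 36 h vs 28 h": (0.984, 0.992),
    "C: 36 h vs 32 h": (0.9956, 0.998),
    "D: 36 h vs 40 h": (0.9973, 0.999),
    "E: 36 h vs 44 h": (0.9899, 0.995),
    "F: 36 h vs 48 h": (0.9784, 0.989),
}

# All fitted slopes are positive, so r is the positive root of R².
differences = {label: abs(sqrt(r_squared) - r_caption)
               for label, (r_squared, r_caption) in tsb_panels.items()}

for label, diff in differences.items():
    print(f"{label}: |sqrt(R²) - r| = {diff:.4f}")

def predict_24h_from_36h(nauc_36h):
    """Apply the panel A regression line, y = 0.6468x - 2.4118."""
    return 0.6468 * nauc_36h - 2.4118

# Hypothetical 36 h NAUC of 40.0, used only to show how the line is applied.
example_prediction = predict_24h_from_36h(40.0)
print(round(example_prediction, 4))
```

Every difference stays below 0.001, so the caption's r values agree with the panel R² values to the three decimals reported.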

Supplementary Figure 3. Scatterplots showing the relation between NAUCs at 36 h and 24-48 h in NB. In each panel the x-axis is NAUC NB 36h and the y-axis is the NAUC in NB at the compared time point; both axes run from 0 to 75. A) 36 h versus 24 h (y = 0.6585x - 2.2957; R² = 0.9405; r = 0.970). B) 36 h versus 28 h (y = 0.7787x - 1.8066; R² = 0.9767; r = 0.988). C) 36 h versus 32 h (y = 0.8811x - 1.2244; R² = 0.9931; r = 0.997). D) 36 h versus 40 h (y = 1.1015x + 1.2396; R² = 0.9955; r = 0.998). E) 36 h versus 44 h (y = 1.2012x + 2.5549; R² = 0.9844; r = 0.992). F) 36 h versus 48 h (y = 1.3032x + 3.7972; R² = 0.9701; r = 0.985). All p-values < 0.001.

Supplementary Table 3. Eigenvalues and respective percentage of variance for the first 13 PCs.
Principal Component | Eigenvalue | Variance (%) | Cumulative Variance (%)
PC1 | 16.8226 | 60.08 | 60.08
PC2 | 4.9391 | 17.64 | 77.72
PC3 | 2.2640 | 8.09 | 85.81
PC4 | 1.2241 | 4.37 | 90.18
PC5 | 1.0682 | 3.82 | 93.99
PC6 | 0.8547 | 3.05 | 97.05
PC7 | 0.3689 | 1.32 | 98.36
PC8 | 0.2257 | 0.81 | 99.17
PC9 | 0.1304 | 0.47 | 99.63
PC10 | 0.0548 | 0.20 | 99.83
PC11 | 0.0220 | 0.08 | 99.91
PC12 | 0.0156 | 0.06 | 99.96
PC13 | 0.0127 | 0.05 | ≈100
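The variance columns of Supplementary Table 3 follow from the eigenvalues: each percentage is the eigenvalue divided by the total variance, and the cumulative column is the running sum. The sketch below assumes that the 13 listed eigenvalues account for essentially all of the variance (their sum is about 28.00), so the results can differ from the table in the second decimal place.

```python
# Eigenvalues for PC1 to PC13, copied from Supplementary Table 3.
eigenvalues = [16.8226, 4.9391, 2.2640, 1.2241, 1.0682, 0.8547, 0.3689,
               0.2257, 0.1304, 0.0548, 0.0220, 0.0156, 0.0127]

# Assumption: components beyond PC13 contribute a negligible amount.
total_variance = sum(eigenvalues)

variance_pct = [100 * value / total_variance for value in eigenvalues]

cumulative_pct = []
running = 0.0
for pct in variance_pct:
    running += pct
    cumulative_pct.append(running)

for index, (pct, cum) in enumerate(zip(variance_pct, cumulative_pct), start=1):
    print(f"PC{index}: {pct:.2f}% (cumulative {cum:.2f}%)")
```

With these inputs PC1 and PC2 together explain a little under 78% of the variance, in line with the axis labels of the PC1 versus PC2 plot.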

Supplementary Table 4. Explanatory variables, and their explanatory value, for PC1 and PC2.
Variables | PC1 | PC2 | Variables | PC1 | PC2
NAUC TSB 4h | -0.770 | | NAUC NB 4h | -0.668 |
NAUC TSB 8h | -0.868 | | NAUC NB 8h | -0.797 |
NAUC TSB 12h | -0.892 | | NAUC NB 12h | -0.874 |
NAUC TSB 16h | -0.902 | | NAUC NB 16h | -0.892 |
NAUC TSB 20h | -0.901 | | NAUC NB 20h | -0.885 |
NAUC TSB 24h | -0.891 | | NAUC NB 24h | -0.869 |
NAUC TSB 28h | -0.877 | | NAUC NB 28h | -0.849 | 0.504
NAUC TSB 32h | -0.863 | | NAUC NB 32h | -0.832 | 0.519
NAUC TSB 36h | -0.844 | | NAUC NB 36h | -0.811 | 0.530
NAUC TSB 40h | -0.828 | -0.505 | NAUC NB 40h | -0.791 | 0.538
NAUC TSB 44h | -0.812 | -0.511 | NAUC NB 44h | -0.769 | 0.546
NAUC TSB 48h | -0.781 | -0.535 | NAUC NB 48h | -0.751 | 0.552

[Figure: scatterplot "PC 1 x PC 2". x-axis: Principal Component 1 (60.1%), from -2.5 to 2.5, with an arrow labelled NAUC TSB & NAUC NB. y-axis: Principal Component 2 (17.6%), from -1.5 to 1.5, with arrows labelled NAUC NB (towards positive values) and NAUC TSB (towards negative values). Strains plotted: BBC|189, BBC|210, BBC|038, BBC|150, BBC|162, BBC|167, BBC|118, BBC|186, BBC|044, BBC|163, BBC|056, BBC|187, BBC|128, BBC|003, BBC|042, BBC|023, BBC|028, BBC|008, BBC|032, BBC|111, BBC|152, BBC|024, BBC|205, BBC|110, BBC|046, BBC|161, BBC|048, BBC|027, BBC|033, BBC|020, BBC|170, BBC|034, BBC|031, BBC|005, BBC|021, BBC|188, BBC|184, BBC|077, BBC|029, BBC|035, BBC|006, BBC|177, BBC|169, BBC|016, BBC|148.]

Supplementary Figure 4. Spatial dispersion of the initial set in the PCA space defined by PC1 and PC2. PC1 corresponds to 60.1% of the total variance of the original data; the values on this axis increase as the NAUCs for both TSB and NB decrease (as shown by the corresponding grey arrow). PC2 corresponds to 17.6% of the total variance of the original data; the values on this axis increase with the NAUCs for NB from 28 h of incubation and decrease with the NAUCs for TSB from 40 h of incubation (as shown by the corresponding grey arrows).

Appendix E Phenotypic characterization of strains in the working set

Supplementary Table 5. NAUCs for 36 h of incubation in TSB and for all substrates in the sole carbon source assay.

Column groups: Reference (TSB); Monosaccharides, Aldoses (Ara, Gal, Glu, Man, Rib, Xyl); Monosaccharides, Ketoses (Fru, Sor); Disaccharides (Cello, Lac); Sugar Alcohols (AraL, GalL, GlyL, InoL, ManL, RibL, SorL, XylL); Esters (MB, T20, T40, T60, T80).
Strain | TSB* | Ara | Gal | Glu | Man | Rib | Xyl | Fru | Sor | Cello | Lac | AraL | GalL | GlyL | InoL | ManL | RibL | SorL | XylL | MB | T20 | T40 | T60 | T80
BBC|003 | 42.15 | 33.23 | 37.25 | 38.72 | 28.24 | 32.69 | 38.39 | 31.59 | 8.22 | 35.76 | 15.71 | 0.35 | 0.00 | 24.37 | 27.59 | 40.66 | 4.88 | 28.98 | 0.37 | 0.67 | 4.60 | 4.41 | 10.23 | 10.76
BBC|005 | 37.76 | 26.25 | 10.50 | 18.90 | 15.43 | 30.65 | 13.48 | 11.71 | 4.46 | 3.28 | 4.43 | 3.51 | 0.77 | 11.84 | 0.00 | 5.26 | 1.16 | 13.97 | 0.00 | 2.95 | 14.70 | 20.06 | 0.00 | 19.02
BBC|008 | 43.57 | 34.93 | 33.57 | 38.11 | 28.96 | 34.43 | 37.78 | 30.50 | 0.94 | 37.24 | 15.75 | 1.79 | 3.20 | 25.13 | 23.83 | 34.33 | 1.83 | 27.48 | 0.00 | 0.00 | 3.48 | 4.70 | 9.97 | 13.99
BBC|016 | 41.16 | 22.38 | 16.62 | 21.28 | 11.42 | 28.28 | 26.13 | 13.49 | 29.49 | 15.15 | 20.03 | 0.00 | 0.00 | 18.18 | 24.57 | 30.65 | 1.56 | 13.02 | 0.00 | 0.00 | 0.62 | 1.54 | 6.97 | 5.40
BBC|020 | 32.41 | 24.72 | 4.23 | 17.35 | 16.13 | 31.94 | 24.04 | 23.35 | 27.25 | 5.66 | 12.30 | 0.00 | 0.00 | 33.39 | 25.58 | 28.07 | 0.49 | 7.87 | 0.00 | 2.06 | 10.14 | 17.54 | 21.02 | 12.95
BBC|021 | 35.48 | 22.16 | 23.93 | 20.82 | 29.16 | 28.98 | 9.69 | 6.20 | 9.84 | 6.70 | 5.33 | 0.85 | 4.47 | 24.28 | 10.49 | 10.17 | 2.42 | 3.71 | 0.00 | 22.43 | 32.38 | 41.09 | 32.80 | 35.30
BBC|024 | 42.88 | 29.09 | 14.27 | 15.10 | 21.25 | 31.50 | 34.03 | 34.71 | 31.96 | 13.27 | 3.54 | 0.00 | 0.93 | 15.50 | 0.00 | 9.16 | 0.89 | 9.90 | 0.00 | 2.45 | 4.39 | 4.02 | 4.82 | 4.90
BBC|027 | 32.27 | 35.63 | 31.70 | 31.27 | 36.47 | 28.57 | 32.74 | 37.47 | 28.17 | 36.97 | 32.53 | 2.72 | 3.85 | 29.14 | 28.41 | 35.91 | 35.39 | 32.13 | 2.71 | 0.00 | 12.84 | 9.16 | 6.83 | 12.81
BBC|029 | 30.88 | 26.82 | 20.45 | 22.43 | 26.01 | 27.54 | 30.66 | 22.07 | 18.82 | 5.72 | 6.47 | 0.03 | 0.70 | 3.07 | 7.00 | 9.47 | 0.97 | 11.99 | 0.00 | 8.20 | 27.06 | 31.84 | 25.21 | 27.68
BBC|031 | 35.47 | 26.82 | 17.49 | 17.45 | 36.42 | 28.96 | 15.94 | 15.63 | 10.60 | 2.32 | 2.55 | 0.40 | 0.00 | 2.63 | 0.00 | 7.43 | 0.00 | 0.00 | 0.00 | 0.20 | 4.38 | 6.87 | 6.91 | 15.34
BBC|032 | 34.75 | 2.57 | 14.76 | 15.76 | 15.89 | 26.66 | 15.18 | 6.64 | 4.22 | 26.45 | 3.05 | 10.84 | 8.81 | 0.07 | 3.04 | 21.72 | 11.89 | 20.02 | 1.70 | 0.05 | 5.24 | 11.11 | 12.89 | 18.09
BBC|034 | 42.47 | 21.93 | 22.56 | 26.88 | 13.33 | 29.64 | 12.57 | 29.25 | 0.00 | 30.20 | 31.43 | 0.06 | 0.00 | 16.29 | 0.00 | 25.28 | 0.93 | 11.28 | 0.00 | 0.00 | 18.30 | 15.77 | 16.57 | 23.49
BBC|035 | 35.45 | 13.32 | 1.39 | 29.17 | 20.93 | 17.20 | 11.07 | 30.52 | 6.85 | 36.27 | 26.72 | 0.00 | 0.00 | 20.91 | 11.98 | 17.80 | 0.85 | 14.89 | 3.21 | 0.00 | 5.04 | 2.68 | 7.11 | 0.67
BBC|056 | 49.35 | 28.83 | 13.82 | 19.69 | 15.09 | 9.82 | 13.95 | 18.51 | 1.15 | 4.90 | 3.15 | 0.83 | 0.23 | 0.16 | 13.65 | 0.00 | 18.11 | 16.80 | 2.22 | 0.00 | 15.17 | 13.38 | 15.64 | 13.80
BBC|077 | 51.00 | 0.00 | 16.08 | 11.26 | 19.05 | 8.09 | 9.72 | 0.00 | 0.00 | 16.53 | 3.15 | 0.42 | 1.72 | 15.20 | 11.76 | 18.03 | 14.46 | 0.89 | 5.76 | 0.00 | 13.45 | 9.92 | 14.83 | 4.76
BBC|118 | 29.98 | 20.82 | 18.16 | 19.48 | 24.66 | 23.38 | 34.18 | 22.10 | 1.44 | 29.64 | 11.84 | 0.00 | 1.30 | 34.98 | 0.03 | 26.25 | 5.90 | 6.47 | 0.80 | 0.91 | 32.86 | 34.26 | 34.27 | 33.05
BBC|128 | 52.01 | 0.00 | 8.58 | 34.00 | 14.88 | 0.00 | 0.00 | 43.80 | 0.00 | 39.54 | 17.85 | 3.31 | 1.54 | 16.03 | 0.39 | 36.06 | 6.95 | 12.92 | 7.95 | 0.00 | 8.46 | 8.78 | 19.43 | 11.66
BBC|148 | 49.31 | 0.00 | 7.61 | 14.12 | 5.21 | 0.07 | 1.89 | 3.19 | 0.00 | 14.05 | 2.12 | 0.00 | 0.00 | 9.93 | 2.01 | 16.44 | 1.25 | 6.47 | 0.74 | 0.00 | 17.01 | 16.52 | 17.01 | 15.66
BBC|161 | 32.20 | 4.45 | 5.06 | 21.99 | 26.46 | 6.39 | 3.71 | 12.34 | 3.64 | 5.90 | 5.30 | 23.34 | 3.45 | 30.95 | 12.06 | 30.25 | 15.12 | 9.36 | 4.60 | 4.06 | 27.82 | 25.77 | 26.13 | 24.34
BBC|169 | 44.19 | 2.18 | 7.51 | 18.40 | 15.89 | 0.55 | 2.24 | 15.68 | 0.00 | 10.18 | 5.87 | 9.62 | 4.47 | 21.44 | 2.24 | 16.14 | 7.36 | 6.68 | 2.79 | 0.79 | 13.49 | 14.37 | 14.73 | 13.78
BBC|170 | 41.64 | 6.01 | 4.75 | 5.87 | 10.92 | 3.00 | 4.16 | 2.18 | 4.11 | 16.74 | 18.94 | 6.64 | 4.32 | 11.06 | 11.50 | 17.42 | 9.65 | 3.01 | 6.20 | 7.16 | 24.15 | 20.49 | 18.68 | 18.50
BBC|177 | 47.67 | 0.54 | 4.57 | 12.27 | 14.78 | 0.30 | 0.00 | 15.52 | 1.22 | 37.32 | 15.86 | 7.73 | 0.74 | 17.69 | 0.00 | 16.78 | 5.99 | 13.83 | 1.65 | 0.00 | 11.21 | 13.52 | 14.14 | 13.87
BBC|184 | 40.39 | 23.31 | 17.16 | 14.74 | 21.34 | 16.47 | 10.74 | 13.73 | 0.00 | 16.21 | 28.17 | 2.56 | 25.70 | 22.89 | 0.32 | 33.00 | 0.93 | 18.68 | 3.35 | 0.00 | 15.55 | 15.91 | 14.86 | 14.83
BBC|186 | 27.23 | 4.10 | 10.48 | 21.79 | 15.26 | 7.48 | 7.12 | 18.98 | 1.28 | 34.81 | 20.38 | 5.24 | 7.63 | 29.15 | 3.92 | 6.82 | 3.19 | 16.08 | 3.33 | 15.76 | 34.84 | 29.68 | 30.71 | 19.29
BBC|205 | 42.82 | 5.63 | 10.85 | 16.93 | 19.34 | 1.66 | 5.07 | 19.02 | 1.28 | 5.68 | 19.21 | 1.52 | 12.35 | 21.56 | 1.52 | 16.07 | 2.31 | 16.59 | 0.00 | 0.94 | 11.58 | 13.66 | 13.44 | 13.18
* TSB, Trypto-casein Soy Broth; Ara, arabinose; Gal, galactose; Glu, glucose; Man, mannose; Rib, ribose; Xyl, xylose; Fru, fructose; Sor, sorbose; Cello, cellobiose; Lac, lactose; AraL, arabitol; GalL, galactitol; GlyL, glycerol; InoL, myo-inositol; ManL, mannitol; RibL, ribitol; SorL, sorbitol; XylL, xylitol; MB, methyl butyrate; T20, Tween 20; T40, Tween 40; T60, Tween 60; T80, Tween 80.

Supplementary Table 6. Descriptive statistics for NAUCs for 36 h of incubation in TSB and for all substrates in the sole carbon source assay.

Statistic | TSB* | Ara | Gal | Glu | Man | Rib | Xyl | Fru | Sor | Cello | Lac | AraL | GalL | GlyL | InoL | ManL | RibL | SorL | XylL | MB | T20 | T40 | T60 | T80
Maximum | 52.01 | 35.63 | 37.25 | 38.72 | 36.47 | 34.43 | 38.39 | 43.80 | 31.96 | 39.54 | 32.53 | 23.34 | 25.70 | 34.98 | 28.41 | 40.66 | 35.39 | 32.13 | 7.95 | 22.43 | 34.84 | 41.09 | 34.27 | 35.30
Minimum | 27.23 | 0.00 | 1.39 | 5.87 | 5.21 | 0.00 | 0.00 | 0.00 | 0.00 | 2.32 | 2.12 | 0.00 | 0.00 | 0.07 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.62 | 1.54 | 0.00 | 0.67
Mean | 39.78 | 16.63 | 14.93 | 20.95 | 20.10 | 18.17 | 15.78 | 19.13 | 7.80 | 19.46 | 13.27 | 3.27 | 3.45 | 18.23 | 8.88 | 20.37 | 6.18 | 12.92 | 1.90 | 2.75 | 14.75 | 15.48 | 15.81 | 15.89
SD | 7.00 | 12.65 | 9.43 | 8.15 | 7.82 | 12.88 | 12.64 | 11.49 | 10.53 | 13.35 | 9.70 | 5.26 | 5.62 | 10.00 | 9.93 | 11.19 | 7.98 | 8.25 | 2.29 | 5.48 | 9.98 | 10.38 | 8.82 | 8.28
CV | 18% | 76% | 63% | 39% | 39% | 71% | 80% | 60% | 135% | 69% | 73% | 161% | 163% | 55% | 112% | 55% | 129% | 64% | 121% | 199% | 68% | 67% | 56% | 52%
* TSB, Trypto-casein Soy Broth; Ara, arabinose; Gal, galactose; Glu, glucose; Man, mannose; Rib, ribose; Xyl, xylose; Fru, fructose; Sor, sorbose; Cello, cellobiose; Lac, lactose; AraL, arabitol; GalL, galactitol; GlyL, glycerol; InoL, myo-inositol; ManL, mannitol; RibL, ribitol; SorL, sorbitol; XylL, xylitol; MB, methyl butyrate; T20, Tween 20; T40, Tween 40; T60, Tween 60; T80, Tween 80; SD, standard deviation; CV, coefficient of variation.
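The descriptive statistics in Supplementary Table 6 can be reproduced from the per-strain values in Supplementary Table 5. The sketch below does this for the TSB column only, assuming the sample standard deviation (n - 1 denominator) and CV = SD / mean; under those assumptions the results match the reported 39.78, 7.00, and 18%.

```python
from statistics import mean, stdev

# NAUCs at 36 h in TSB for the 25 working-set strains
# (TSB column of Supplementary Table 5, in table order).
tsb_nauc = [42.15, 37.76, 43.57, 41.16, 32.41, 35.48, 42.88, 32.27, 30.88,
            35.47, 34.75, 42.47, 35.45, 49.35, 51.00, 29.98, 52.01, 49.31,
            32.20, 44.19, 41.64, 47.67, 40.39, 27.23, 42.82]

tsb_max = max(tsb_nauc)
tsb_min = min(tsb_nauc)
tsb_mean = mean(tsb_nauc)
tsb_sd = stdev(tsb_nauc)          # sample SD (assumed), n - 1 denominator
tsb_cv = 100 * tsb_sd / tsb_mean  # coefficient of variation, in percent

print(f"max={tsb_max:.2f} min={tsb_min:.2f} mean={tsb_mean:.2f} "
      f"SD={tsb_sd:.2f} CV={tsb_cv:.0f}%")
```

The same calculation applies to any other column of the table by swapping in that column's 25 values.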

Supplementary Table 7. NAUCs for biomass production, and effect of environmental factors (temperature, salinity, and pH) on strain growth.

Column groups: Culture Media (120 h): NB, TSB, LB, BHI; Temperature (72 h): 20 °C, 28 °C, 37 °C; Salinity (120 h): 0% to 10% NaCl; pH (120 h): pH 3 to pH 11.
Strain | NB* | TSB | LB | BHI | 20 °C | 28 °C | 37 °C | 0% NaCl | 1% NaCl | 3% NaCl | 5% NaCl | 7% NaCl | 9% NaCl | 10% NaCl | pH 3 | pH 5 | pH 7 | pH 9 | pH 11
BBC|003 | 92.37 | 120.61 | 77.56 | 90.02 | 73.67 | 74.18 | 51.87 | 74.30 | 70.95 | 60.17 | 44.32 | 37.20 | 28.22 | 21.53 | 0.00 | 101.44 | 120.17 | 78.71 | 39.19
BBC|005 | 93.02 | 147.96 | 81.07 | 115.88 | 99.18 | 92.68 | 47.38 | 62.05 | 90.91 | 102.50 | 78.34 | 24.48 | 15.94 | 2.59 | 0.49 | 133.22 | 144.87 | 116.32 | 76.99
BBC|008 | 95.94 | 118.90 | 79.02 | 96.26 | 69.26 | 70.00 | 55.18 | 72.14 | 85.59 | 81.39 | 43.02 | 35.69 | 30.19 | 18.90 | 0.27 | 119.95 | 116.87 | 85.79 | 42.51
BBC|016 | 63.23 | 87.94 | 62.06 | 69.97 | 35.79 | 34.03 | 40.16 | 57.16 | 55.78 | 57.61 | 26.04 | 14.47 | 18.22 | 10.38 | 0.85 | 18.63 | 86.61 | 67.69 | 32.95
BBC|020 | 74.89 | 71.02 | 56.96 | 100.25 | 38.56 | 38.80 | 38.03 | 36.93 | 58.67 | 42.83 | 27.02 | 27.85 | 19.88 | 16.55 | 0.00 | 18.68 | 70.71 | 60.92 | 32.60
BBC|021 | 71.35 | 108.00 | 60.24 | 77.75 | 52.32 | 51.51 | 43.91 | 48.20 | 68.62 | 61.10 | 22.58 | 16.45 | 2.42 | 2.61 | 0.00 | 68.52 | 107.26 | 92.95 | 80.70
BBC|024 | 69.47 | 112.70 | 59.18 | 58.54 | 44.21 | 58.83 | 37.68 | 69.26 | 30.27 | 13.44 | 22.65 | 4.99 | 1.35 | 2.02 | 0.00 | 27.86 | 111.23 | 85.30 | 54.53
BBC|027 | 47.61 | 75.93 | 47.66 | 48.62 | 71.71 | 74.44 | 30.54 | 95.94 | 40.58 | 30.36 | 18.00 | 14.13 | 11.81 | 5.48 | 1.02 | 51.19 | 74.08 | 64.61 | 42.69
BBC|029 | 95.28 | 186.09 | 73.93 | 90.10 | 77.45 | 89.91 | 36.41 | 57.65 | 38.65 | 11.97 | 19.58 | 2.86 | 4.21 | 2.80 | 0.00 | 65.57 | 188.65 | 95.41 | 61.86
BBC|031 | 86.13 | 102.48 | 73.69 | 108.05 | 18.36 | 31.90 | 46.71 | 42.41 | 45.85 | 17.99 | 18.01 | 7.29 | 4.41 | 2.23 | 0.79 | 85.85 | 101.12 | 80.22 | 51.01
BBC|032 | 105.37 | 124.04 | 90.80 | 110.92 | 57.26 | 58.00 | 39.93 | 76.28 | 86.33 | 73.20 | 38.73 | 30.88 | 1.94 | 1.61 | 0.00 | 144.91 | 121.92 | 89.28 | 79.86
BBC|034 | 75.93 | 95.96 | 73.73 | 67.43 | 18.12 | 25.00 | 17.86 | 89.04 | 92.82 | 54.14 | 18.29 | 13.99 | 0.62 | 0.42 | 0.59 | 9.87 | 94.72 | 106.57 | 68.39
BBC|035 | 81.18 | 118.38 | 63.97 | 117.91 | 37.84 | 57.43 | 52.37 | 97.41 | 60.97 | 80.31 | 105.61 | 61.93 | 44.82 | 17.42 | 37.97 | 142.36 | 121.89 | 75.02 | 10.30
BBC|056 | 45.11 | 84.66 | 47.90 | 64.43 | 39.83 | 34.91 | 34.35 | 78.37 | 52.12 | 79.98 | 16.07 | 17.46 | 17.24 | 14.08 | 0.20 | 8.34 | 83.25 | 48.96 | 11.06
BBC|077 | 48.04 | 96.54 | 39.22 | 62.30 | 39.13 | 33.91 | 38.53 | 66.66 | 63.04 | 37.60 | 25.56 | 23.73 | 17.79 | 23.66 | 0.00 | 7.20 | 94.86 | 67.52 | 36.90
BBC|118 | 70.78 | 115.32 | 61.45 | 48.97 | 78.26 | 46.71 | 17.59 | 55.89 | 65.00 | 35.81 | 17.02 | 6.41 | 1.47 | 0.99 | 0.00 | 35.46 | 113.50 | 81.14 | 42.27
BBC|128 | 21.35 | 50.21 | 16.35 | 29.89 | 30.75 | 39.49 | 34.08 | 46.72 | 28.42 | 26.80 | 10.49 | 13.53 | 6.63 | 9.48 | 0.47 | 16.21 | 48.84 | 36.08 | 29.37
BBC|148 | 40.34 | 73.36 | 41.04 | 61.96 | 54.70 | 46.44 | 32.74 | 79.16 | 29.51 | 26.02 | 21.37 | 14.57 | 17.65 | 11.44 | 0.93 | 5.88 | 71.63 | 50.88 | 26.94
BBC|161 | 96.34 | 130.84 | 84.01 | 98.05 | 59.18 | 67.68 | 41.19 | 95.28 | 78.54 | 72.56 | 36.26 | 26.80 | 5.95 | 2.51 | 0.00 | 153.09 | 127.61 | 88.32 | 64.20
BBC|169 | 73.38 | 104.22 | 63.06 | 69.47 | 40.61 | 47.06 | 40.25 | 68.95 | 66.69 | 36.93 | 34.64 | 29.25 | 18.76 | 22.37 | 0.39 | 90.94 | 101.64 | 83.31 | 75.21
BBC|170 | 85.11 | 165.86 | 98.75 | 141.41 | 74.20 | 83.63 | 35.32 | 99.87 | 82.06 | 77.88 | 23.27 | 16.31 | 3.28 | 0.51 | 0.00 | 147.67 | 166.17 | 136.97 | 97.89
BBC|177 | 93.27 | 191.31 | 66.34 | 114.71 | 45.31 | 55.83 | 45.53 | 79.73 | 94.89 | 42.09 | 23.36 | 21.67 | 15.56 | 16.10 | 0.00 | 27.18 | 194.18 | 148.17 | 106.23
BBC|184 | 84.86 | 175.81 | 74.35 | 105.61 | 74.52 | 94.00 | 51.85 | 85.99 | 74.08 | 48.06 | 21.45 | 13.61 | 7.53 | 5.58 | 0.00 | 11.42 | 173.96 | 147.58 | 75.56
BBC|186 | 105.06 | 139.80 | 79.56 | 107.29 | 72.60 | 68.93 | 30.31 | 75.54 | 120.55 | 68.01 | 28.38 | 16.66 | 8.04 | 1.89 | 0.00 | 139.99 | 136.82 | 114.64 | 79.61
BBC|205 | 74.48 | 129.04 | 64.87 | 120.37 | 34.58 | 51.98 | 44.27 | 64.91 | 72.07 | 51.39 | 25.08 | 24.24 | 10.84 | 23.96 | 0.00 | 13.80 | 126.97 | 121.68 | 100.63
* NB, Nutrient Broth; TSB, Trypto-casein Soy Broth; LB, LB (Miller) Broth; BHI, Brain-Heart Infusion Broth; NaCl, sodium chloride.

Supplementary Table 8. Descriptive statistics for NAUCs for biomass production, and effect of environmental factors (temperature, salinity, and pH) on strain growth.

Statistic | NB* | TSB | LB | BHI | 20 °C | 28 °C | 37 °C | 0% NaCl | 1% NaCl | 3% NaCl | 5% NaCl | 7% NaCl | 9% NaCl | 10% NaCl | pH 3 | pH 5 | pH 7 | pH 9 | pH 11
Maximum | 105.37 | 191.31 | 98.75 | 141.41 | 99.18 | 94.00 | 55.18 | 99.87 | 120.55 | 102.50 | 105.61 | 61.93 | 44.82 | 23.96 | 37.97 | 153.09 | 194.18 | 148.17 | 106.23
Minimum | 21.35 | 50.21 | 16.35 | 29.89 | 18.12 | 25.00 | 17.59 | 36.93 | 25.42 | 11.97 | 10.49 | 2.86 | 0.62 | 0.42 | 0.00 | 5.88 | 48.84 | 36.08 | 10.30
Mean | 75.60 | 117.08 | 65.47 | 87.05 | 53.50 | 57.09 | 39.36 | 71.03 | 66.12 | 51.61 | 30.61 | 20.66 | 12.59 | 9.48 | 1.76 | 65.81 | 115.98 | 88.96 | 56.78
SD | 21.50 | 36.18 | 17.86 | 27.85 | 20.82 | 20.08 | 9.49 | 17.53 | 23.01 | 24.06 | 20.67 | 12.55 | 10.75 | 8.35 | 7.55 | 54.70 | 36.82 | 29.38 | 26.93
CV | 28% | 31% | 27% | 32% | 39% | 35% | 24% | 25% | 35% | 47% | 68% | 61% | 85% | 88% | 429% | 83% | 32% | 33% | 47%
* NB, Nutrient Broth; TSB, Trypto-casein Soy Broth; LB, LB (Miller) Broth; BHI, Brain-Heart Infusion Broth; NaCl, sodium chloride; SD, standard deviation; CV, coefficient of variation.

Appendix F Utilization of sugars, sugar alcohols and esters

Supplementary Table 9. Utilization of sugars as sole carbon source for each strain.
Column groups: Aldoses (Ara, Gal, Glu, Man, Rib, Xyl); Ketoses (Fru, Sor); Disaccharides (Cello, Lac).
Strain | Ara* | Gal | Glu | Man | Rib | Xyl | Fru | Sor | Cello | Lac
Cluster I#
BBC|003 | ++ | ++ | ++ | + | ++ | ++ | ++ | ± | ++ | +
BBC|008 | ++ | ++ | ++ | + | ++ | ++ | + | - | ++ | +
BBC|027 | ++ | ++ | ++ | ++ | ++ | ++ | ++ | ++ | ++ | ++
Cluster II
BBC|005 | + | ± | + | + | ++ | + | + | ± | ± | ±
BBC|032 | ± | + | + | + | ++ | + | ± | ± | ++ | ±
BBC|161 | ± | ± | + | ++ | ± | ± | + | ± | ± | ±
BBC|169 | - | ± | + | + | - | ± | + | NG | ± | ±
BBC|170 | ± | ± | ± | ± | ± | ± | ± | ± | + | +
BBC|177 | - | ± | ± | + | - | NG | + | - | ++ | +
BBC|184 | + | + | + | + | + | ± | + | NG | + | +
BBC|186 | ± | + | ++ | + | ± | ± | + | - | ++ | ++
BBC|205 | ± | ± | + | + | - | ± | + | - | ± | +
Cluster III
BBC|016 | + | + | + | ± | + | + | + | ++ | + | +
BBC|020 | ++ | ± | + | + | ++ | ++ | ++ | ++ | ± | +
BBC|024 | + | + | + | + | ++ | ++ | ++ | ++ | + | ±
BBC|031 | ++ | + | + | ++ | ++ | + | + | ± | ± | ±
BBC|056 | + | ± | + | + | ± | ± | + | - | ± | ±
BBC|077 | NG | + | ± | + | ± | ± | NG | NG | + | ±
BBC|148 | NG | ± | ± | ± | - | - | ± | NG | ± | -
Cluster IV
BBC|021 | + | + | + | ++ | ++ | ± | ± | ± | ± | ±
BBC|029 | ++ | + | ++ | ++ | ++ | ++ | ++ | + | ± | ±
BBC|034 | + | + | + | + | + | ± | + | NG | ++ | ++
BBC|118 | + | + | + | ++ | ++ | ++ | ++ | - | ++ | +
Cluster V
BBC|128 | NG | ± | + | ± | NG | NG | ++ | NG | ++ | +
Cluster VI
BBC|035 | + | - | ++ | + | + | + | ++ | ± | ++ | ++
* Ara, arabinose; Gal, galactose; Glu, glucose; Man, mannose; Rib, ribose; Xyl, xylose; Fru, fructose; Sor, sorbose; Cello, cellobiose; Lac, lactose; NG, no growth (NAUC = 0.00); -, negligible growth; ±, weak growth; +, positive growth; ++, strong positive growth.
# Strains are grouped in clusters according to the dendrogram generated after PCA, presented in Figure 4.

Supplementary Table 10. Utilization of sugar alcohols as sole carbon source for each strain.
Strain | AraL* | GalL | GlyL | InoL | ManL | RibL | SorL | XylL
Cluster I#
BBC|003 | - | NG | + | + | ++ | ± | + | -
BBC|008 | - | ± | + | + | ++ | - | + | NG
BBC|027 | ± | ± | ++ | ++ | ++ | ++ | ++ | ±
Cluster II
BBC|005 | ± | - | + | NG | ± | - | + | NG
BBC|032 | + | ± | - | ± | + | + | + | -
BBC|161 | ++ | ± | ++ | + | ++ | + | ± | ±
BBC|169 | ± | ± | + | ± | + | ± | ± | ±
BBC|170 | ± | ± | ± | ± | + | ± | ± | ±
BBC|177 | ± | - | + | NG | + | ± | ± | -
BBC|184 | ± | + | + | - | ++ | - | + | ±
BBC|186 | ± | ± | ++ | ± | ± | ± | + | ±
BBC|205 | - | ± | + | - | + | ± | + | NG
Cluster III
BBC|016 | NG | NG | + | + | ++ | - | + | NG
BBC|020 | NG | NG | ++ | ++ | ++ | - | ± | NG
BBC|024 | NG | - | + | NG | ± | - | ± | NG
BBC|031 | - | NG | ± | NG | ± | NG | NG | NG
BBC|056 | - | - | - | ± | NG | + | + | -
BBC|077 | - | - | ± | ± | + | ± | - | ±
BBC|148 | NG | NG | ± | - | + | - | ± | -
Cluster IV
BBC|021 | - | ± | + | ± | ± | ± | ± | NG
BBC|029 | - | - | ± | ± | + | - | + | NG
BBC|034 | - | NG | + | NG | + | - | ± | NG
BBC|118 | NG | - | ++ | - | ++ | ± | ± | -
Cluster V
BBC|128 | ± | - | + | - | + | ± | ± | ±
Cluster VI
BBC|035 | NG | NG | + | + | + | - | + | ±
* AraL, arabitol; GalL, galactitol; GlyL, glycerol; InoL, myo-inositol; ManL, mannitol; RibL, ribitol; SorL, sorbitol; XylL, xylitol; NG, no growth (NAUC = 0.00); -, negligible growth; ±, weak growth; +, positive growth; ++, strong positive growth.
# Strains are grouped in clusters according to the dendrogram generated after PCA, presented in Figure 4.

Supplementary Table 11. Utilization of esters as sole carbon source for each strain.
Strain | Methyl Butyrate* | Tween 20 | Tween 40 | Tween 60 | Tween 80
Cluster I#
BBC|003 | - | ± | ± | ± | ±
BBC|008 | NG | ± | ± | ± | +
BBC|027 | NG | + | ± | ± | +
Cluster II
BBC|005 | ± | + | + | NG | +
BBC|032 | - | ± | + | + | +
BBC|161 | ± | ++ | ++ | ++ | ++
BBC|169 | - | + | + | + | +
BBC|170 | ± | + | + | + | +
BBC|177 | NG | ± | ± | ± | ±
BBC|184 | NG | + | + | + | +
BBC|186 | + | ++ | ++ | ++ | ++
BBC|205 | - | ± | + | + | +
Cluster III
BBC|016 | NG | - | - | ± | ±
BBC|020 | ± | + | + | + | +
BBC|024 | ± | ± | ± | ± | ±
BBC|031 | - | ± | ± | ± | +
BBC|056 | NG | + | ± | + | ±
BBC|077 | NG | ± | ± | ± | ±
BBC|148 | NG | + | + | + | +
Cluster IV
BBC|021 | + | ++ | ++ | ++ | ++
BBC|029 | ± | ++ | ++ | ++ | ++
BBC|034 | NG | + | + | + | +
BBC|118 | - | ++ | ++ | ++ | ++
Cluster V
BBC|128 | NG | ± | ± | + | ±
Cluster VI
BBC|035 | NG | ± | ± | ± | -
* NG, no growth (NAUC = 0.00); -, negligible growth; ±, weak growth; +, positive growth; ++, strong positive growth.
# Strains are grouped in clusters according to the dendrogram generated after PCA, presented in Figure 4.

Appendix G Enzyme producers

Supplementary Table 12. Best carboxyl ester hydrolase producers in the working set.
Strain | TRIB | MB* | T20 | T40 | T60 | T80 | Producers# | Best Producers
Cluster I†
BBC|003 | ++ | - | + | + | + | + | ++ |
BBC|008 | - | - | + | + | + | + | + |
BBC|027 | + | - | + | + | + | + | + |
Cluster II
BBC|005 | - | - | + | + | + | + | + |
BBC|032 | ++ | - | + | + | + | + | ++ |
BBC|161 | ++ | + (5) | ++ (4) | ++ (5) | ++ (4) | ++ (4) | ++ | ●
BBC|169 | ++ | - | + | + | + | + | ++ |
BBC|170 | ++ | + (4) | + | + | + | + | ++ | ●
BBC|177 | ++ | - | + | + | + | + | ++ |
BBC|184 | - | - | + | + | + | + | + |
BBC|186 | - | + (2) | ++ (1) | ++ (4) | ++ (3) | ++ (6) | ++ | ●
BBC|205 | ++ | - | + | + | + | + | ++ |
Cluster III
BBC|016 | ++ | - | - | - | + | + | ++ |
BBC|020 | + | + | + | + | + | + | + |
BBC|024 | ++ | + | + | + | + | + | ++ |
BBC|031 | + | - | + | + | + | + | + |
BBC|056 | ++ | - | + | + | + | + | ++ |
BBC|077 | ++ | - | + | + | + | + | ++ |
BBC|148 | ++ | - | + | + | + | + | ++ |
Cluster IV
BBC|021 | + | + (1) | ++ (3) | ++ (1) | ++ (2) | ++ (1) | ++ | ●
BBC|029 | + | + (3) | ++ (5) | ++ (3) | ++ (5) | ++ (3) | ++ | ●
BBC|034 | ++ | - | + | + | + | + | ++ |
BBC|118 | ++ | - | ++ (2) | ++ (2) | ++ (1) | ++ (2) | ++ | ●
Cluster V
BBC|128 | - | - | + | + | + | + | + |
Cluster VI
BBC|035 | ++ | - | + | + | + | - | ++ |
* TRIB, tributyrin test; MB, methyl butyrate; T20, Tween 20; T40, Tween 40; T60, Tween 60; T80, Tween 80 (results from the sole carbon source assay). Results divided into three classes: -, negative or negligible; +, positive; ++, strong positive. The rank of the best five producers for each substrate in the sole carbon source assay is given in brackets.
# CEH producers were classified with the best result for each strain, regardless of the substrate.
† Strains are grouped in clusters according to the dendrogram generated after PCA, presented in Figure 4.
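The bracketed ranks in Supplementary Table 12 can be cross-checked against the NAUCs in Supplementary Table 5. The sketch below does so for Tween 20 only, assuming that the rank is simply the descending order of the 36 h NAUC on that substrate; the dictionary holds the T20 column of Supplementary Table 5.

```python
# NAUCs with Tween 20 as sole carbon source (T20 column, Supplementary Table 5).
t20_nauc = {
    "BBC|003": 4.60, "BBC|005": 14.70, "BBC|008": 3.48, "BBC|016": 0.62,
    "BBC|020": 10.14, "BBC|021": 32.38, "BBC|024": 4.39, "BBC|027": 12.84,
    "BBC|029": 27.06, "BBC|031": 4.38, "BBC|032": 5.24, "BBC|034": 18.30,
    "BBC|035": 5.04, "BBC|056": 15.17, "BBC|077": 13.45, "BBC|118": 32.86,
    "BBC|128": 8.46, "BBC|148": 17.01, "BBC|161": 27.82, "BBC|169": 13.49,
    "BBC|170": 24.15, "BBC|177": 11.21, "BBC|184": 15.55, "BBC|186": 34.84,
    "BBC|205": 11.58,
}

# Assumed ranking rule: highest NAUC first, top five kept.
top_five = sorted(t20_nauc, key=t20_nauc.get, reverse=True)[:5]

for rank, strain in enumerate(top_five, start=1):
    print(rank, strain, t20_nauc[strain])
```

Under this assumption the order is BBC|186, BBC|118, BBC|021, BBC|161, BBC|029, which matches the ranks (1) to (5) shown in the T20 column.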

Supplementary Table 13. Best β-glucosidase producers in the working set.
Strain | Esculin* | Cellobiose | Producers# | Best Producers†
Cluster I‡
BBC|003 | ++ | ++ | ++ | ●
BBC|008 | ++ | ++ | ++ | ●
BBC|027 | ++ | ++ | ++ | ●
Cluster II
BBC|005 | + | + | + |
BBC|032 | - | ++ | ++ |
BBC|161 | - | + | + |
BBC|169 | - | + | + |
BBC|170 | ++ | + | ++ |
BBC|177 | ++ | ++ | ++ | ●
BBC|184 | - | + | + |
BBC|186 | - | ++ | ++ |
BBC|205 | - | + | + |
Cluster III
BBC|016 | ++ | + | ++ |
BBC|020 | - | + | + |
BBC|024 | ++ | + | ++ |
BBC|031 | - | + | + |
BBC|056 | + | + | + |
BBC|077 | - | + | + |
BBC|148 | + | + | + |
Cluster IV
BBC|021 | - | + | + |
BBC|029 | + | + | + |
BBC|034 | ++ | ++ | ++ | ●
BBC|118 | ++ | ++ | ++ | ●
Cluster V
BBC|128 | + | ++ | ++ |
Cluster VI
BBC|035 | ++ | ++ | ++ | ●
* Esculin test; growth in cellobiose as sole carbon source. Results divided into three classes: -, negative or negligible; +, positive; ++, strong positive.
# β-glucosidase producers were classified with the best result for each strain, regardless of the test.
† Strains with strong positive results in both tests (selected as best producers).
‡ Strains are grouped in clusters according to the dendrogram generated after PCA, presented in Figure 4.
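The two classification rules stated in the footnotes of Supplementary Table 13 (overall class = best result of the two tests; best producer = strong positive in both tests) can be written as a short sketch. The function name `classify_producer` and the numeric ordering of the classes are illustrative assumptions; the example rows are taken from the table.

```python
# Ordinal ranking of the three result classes used in the table.
LEVEL = {"-": 0, "+": 1, "++": 2}

def classify_producer(esculin, cellobiose):
    """Return (overall class, is best producer) for one strain."""
    # Footnote #: the best result of the two tests defines the producer class.
    overall = max(esculin, cellobiose, key=LEVEL.get)
    # Footnote †: best producers are strong positive in both tests.
    is_best = esculin == "++" and cellobiose == "++"
    return overall, is_best

# A few rows from Supplementary Table 13: (esculin, cellobiose).
examples = {
    "BBC|003": ("++", "++"),
    "BBC|005": ("+", "+"),
    "BBC|032": ("-", "++"),
    "BBC|170": ("++", "+"),
}

classified = {strain: classify_producer(*tests)
              for strain, tests in examples.items()}

for strain, (overall, is_best) in classified.items():
    print(strain, overall, "best producer" if is_best else "")
```

Of these four strains only BBC|003 meets the both-tests criterion, consistent with the marker in the Best Producers column.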

Supplementary Table 14. Best pectin-degrading enzyme producers in the working set.
Strain | YNBA+PEC* | TSA+PEC | Producers# | Best Producers†
Cluster I‡
BBC|003 | + | - | + |
BBC|008 | - | ++ | ++ |
BBC|027 | - | + | + |
Cluster II
BBC|005 | + | + | + | ●
BBC|032 | + | - | + |
BBC|161 | + | + | + | ●
BBC|169 | + | + | + | ●
BBC|170 | + | ++ | ++ | ●
BBC|177 | - | - | - |
BBC|184 | - | - | - |
BBC|186 | - | - | - |
BBC|205 | - | + | + |
Cluster III
BBC|016 | + | + | + | ●
BBC|020 | - | + | + |
BBC|024 | - | - | - |
BBC|031 | - | - | - |
BBC|056 | - | + | + |
BBC|077 | - | - | - |
BBC|148 | - | - | - |
Cluster IV
BBC|021 | - | - | - |
BBC|029 | + | + | + | ●
BBC|034 | - | - | - |
BBC|118 | - | - | - |
Cluster V
BBC|128 | + | - | + |
Cluster VI
BBC|035 | - | + | + |
* YNBA+PEC, yeast nitrogen base without amino acids agar supplemented with 1% w/v of pectin; TSA+PEC, trypto-casein soy agar supplemented with 1% w/v of pectin. Results divided into three classes: -, negative; +, positive; ++, strong positive.
# Pectin-degrading enzyme producers were classified with the best result for each strain, regardless of the test.
† Strains with positive results in both tests (selected as best producers).
‡ Strains are grouped in clusters according to the dendrogram generated after PCA, presented in Figure 4.

Supplementary Table 15. Best peptidase producers in the working set.
Strain | Casein* | Gelatin | Producers# | Best Producers†
Cluster I‡
BBC|003 | - | - | - |
BBC|008 | - | - | - |
BBC|027 | - | - | - |
Cluster II
BBC|005 | - | - | - |
BBC|032 | - | - | - |
BBC|161 | - | - | - |
BBC|169 | - | - | - |
BBC|170 | + | + | + | ●
BBC|177 | - | - | - |
BBC|184 | - | - | - |
BBC|186 | - | - | - |
BBC|205 | - | - | - |
Cluster III
BBC|016 | - | - | - |
BBC|020 | - | - | - |
BBC|024 | - | - | - |
BBC|031 | - | - | - |
BBC|056 | - | - | - |
BBC|077 | - | - | - |
BBC|148 | + | - | + |
Cluster IV
BBC|021 | + | - | + |
BBC|029 | - | - | - |
BBC|034 | - | + | + |
BBC|118 | + | + | + | ●
Cluster V
BBC|128 | - | - | - |
Cluster VI
BBC|035 | + | + | + | ●
* Skim milk test; gelatin liquefaction test. Results divided into two classes: -, negative; +, positive.
# Peptidase producers were classified with the best result for each strain, regardless of the test.
† Strains with positive results in both tests (selected as best producers).
‡ Strains are grouped in clusters according to the dendrogram generated after PCA, presented in Figure 4.

Appendix H Phylogenetic trees of the strains in the working set

[Figure: phylogenetic trees, panels A to F. Tip labels are BBC strains and type strains (superscript T) of Staphylococcus, Comamonas, Aeromonas, and Enterobacter species, each followed by its GenBank accession number. Tree topology, bootstrap values, and scale bars are not recoverable from the extracted text.]

Supplementary Figure 5. Phylogenetic trees for the strains in the working set. For each phylogenetic reconstruction: the evolutionary history was inferred using the neighbor-joining method [290], based on partial 16S rRNA gene sequences; bootstrap values (1000 replicates) are shown at the branching points [294]; evolutionary distances were computed using the Jukes-Cantor method [295]; each tree is drawn to scale; gaps and missing data were eliminated. Evolutionary analyses were conducted in MEGA X [289]. T indicates the type strain of the species. The alphanumeric sequence corresponds to the GenBank accession number of the 16S rRNA sequence. Strains from the working set and the closest species/strains are shown in bold. Bars correspond to nucleotide substitutions per site.
A) Relationship of strains BBC|056, BBC|077, BBC|148, BBC|177, and BBC|205 with close species in the genus Staphylococcus. Optimal tree with the sum of branch length = 0.0234. The analysis involved 15 sequences, with a total of 581 positions in the final dataset.
B) Relationship of strain BBC|128 with close species in the genus Staphylococcus. Optimal tree with the sum of branch length = 0.0215. The analysis involved 14 sequences, with a total of 875 positions in the final dataset.
C) Relationship of strain BBC|031 with close species in the genus Comamonas. Optimal tree with the sum of branch length = 0.0937. The analysis involved 9 sequences, with a total of 914 positions in the final dataset.
D) Relationship of strains BBC|016 and BBC|118 with close species in the genus Aeromonas. Optimal tree with the sum of branch length = 0.0187. The analysis involved 13 sequences, with a total of 848 positions in the final dataset.
E) Relationship of strain BBC|034 with close species in the genus Aeromonas. Optimal tree with the sum of branch length = 0.0077. The analysis involved 11 sequences, with a total of 647 positions in the final dataset.
F) Relationship of strains BBC|003 and BBC|008 with close species in the genus Enterobacter. Optimal tree with the sum of branch length = 0.0358. The analysis involved 15 sequences, with a total of 775 positions in the final dataset. Note that in this tree the three subspecies of Enterobacter hormaechei appear in two separate clusters.
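As an illustration of the distance step named in the caption, the following is a minimal sketch of the Jukes-Cantor correction, d = -(3/4) ln(1 - (4/3) p), where p is the proportion of differing sites, applied after removing every alignment column that contains a gap or ambiguous base. It is not the MEGA X implementation; the function names and the toy alignment are hypothetical, and treating "gaps and missing data were eliminated" as column-wise (complete) deletion is an assumption.

```python
import math


def complete_deletion(alignment):
    """Drop every column in which any sequence has a gap or ambiguous base."""
    valid = set("ACGT")
    kept = [col for col in zip(*alignment.values()) if all(b in valid for b in col)]
    # Rebuild each sequence from the retained columns (dict order is preserved).
    return {name: "".join(col[i] for col in kept) for i, name in enumerate(alignment)}


def jukes_cantor(seq_a, seq_b):
    """Jukes-Cantor distance in substitutions per site for two aligned sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned, equal in length, and non-empty")
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    if p == 0:
        return 0.0
    if p >= 0.75:
        raise ValueError("distance is undefined for p >= 0.75 (saturation)")
    return -0.75 * math.log(1 - 4 * p / 3)


# Hypothetical toy alignment, not data from this work.
toy = {
    "ref": "ACGTACGTAC",
    "s1":  "ACGTACGTAT",
    "s2":  "ACG-ACGTAC",
}

clean = complete_deletion(toy)                 # the gapped column is removed
d_ref_s1 = jukes_cantor(clean["ref"], clean["s1"])
d_ref_s2 = jukes_cantor(clean["ref"], clean["s2"])
```

A matrix of such pairwise distances is the input that a neighbor-joining algorithm would then use to build the tree.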

[Supplementary Figure 5 (cont.), panels G to N: phylogenetic tree images. Panels G and H, Citrobacter; I and J, Acinetobacter; K, L, and M, Pseudomonas; N, Stenotrophomonas.]

Supplementary Figure 5. Phylogenetic trees for the strains in the working set (cont.).
G) Relationship of strains BBC|020 and BBC|032 with the species in the genus Citrobacter. Optimal tree with the sum of branch length = 0.0633. The analysis involved 16 sequences, with a total of 718 positions in the final dataset.
H) Relationship of strain BBC|184 with the species in the genus Citrobacter. Optimal tree with the sum of branch length = 0.0670. The analysis involved 15 sequences, with a total of 955 positions in the final dataset.
I) Relationship of strain BBC|021 with close species in the genus Acinetobacter. Optimal tree with the sum of branch length = 0.1767. The analysis involved 15 sequences, with a total of 868 positions in the final dataset.
J) Relationship of strain BBC|029 with close strains in the genus Acinetobacter. Optimal tree with the sum of branch length = 0.0796. The analysis involved 10 sequences, with a total of 953 positions in the final dataset.
K) Relationship of strain BBC|005 with close species in the genus Pseudomonas. Optimal tree with the sum of branch length = 0.0851. The analysis involved 13 sequences, with a total of 576 positions in the final dataset.
L) Relationship of strain BBC|161 with close species in the genus Pseudomonas. Optimal tree with the sum of branch length = 0.0419. The analysis involved 15 sequences, with a total of 858 positions in the final dataset.
M) Relationship of strain BBC|186 with close species in the genus Pseudomonas. Optimal tree with the sum of branch length = 0.0251. The analysis involved 8 sequences, with a total of 849 positions in the final dataset.
N) Relationship of strain BBC|170 with close species in the genus Stenotrophomonas. Optimal tree with the sum of branch length = 0.0502. The analysis involved 11 sequences, with a total of 834 positions in the final dataset.
