
Exploring Polar Microbiomes as Source of Bioactive Molecules

Adriana Isabel Correia Rego

Master's in Cell and Molecular Biology

Department of Biology, Faculty of Sciences of the University of Porto

Master's Dissertation

2016/2017

Supervisor: Pedro Leão, FCT Researcher, Interdisciplinary Centre of Marine and Environmental Research (CIIMAR)

Co-supervisor: Catarina Magalhães, FCT Researcher, Interdisciplinary Centre of Marine and Environmental Research (CIIMAR), and Invited Assistant Professor, FCUP

FCUP ii Exploring Polar Microbiomes as Source of Bioactive Molecules

All corrections required by the jury, and only those, have been made. The President of the Jury, Porto, ______/______/______


Acknowledgements

First of all, I want to thank my parents, without whose support none of this would have been possible. I thank my supervisors, Pedro Leão and Catarina Magalhães, for the trust they placed in me, for all the knowledge they shared and the time they gave, and for the opportunity to carry out challenging and fulfilling research. I also want to thank Teresa Martins, António Sousa and Inês Ribeiro, companions on this journey, for all the good moments shared and the mutual help. In particular, I thank Teresa for her friendship and tireless help with the chemistry, António for his help and patience with the bioinformatic analysis, and Inês for her company and help with the assays. I thank the Ecobiotec laboratory team, especially Fátima Carvalho and Mafalda Baptista, for their help with the bacterial isolations. To the Bioinformatics group, Maria Paola and António, for everything they taught me and for the good-humoured moments. To the whole BBE team, particularly João, Raquel and Vítor, for providing the strains from the culture collection and the DNA samples. To Tiago, for all his help with the cytotoxicity assays. Also to Jorge, for his company at lunch and for always having a word of encouragement. Finally, I thank Alfredo for his unconditional support.

I also thank NORTE2020, the European Regional Development Fund (ERDF), the structured programmes R&D&I MarInfo - NORTE-01-0145-FEDER-000031 and R&D&I INNOVMAR - NORTE-01-0145-FEDER-000035, NOVELMAR, and the Portuguese Polar Program (PROPOLAR) for funding.


Summary

With the rising incidence of multidrug-resistant bacterial strains and of diseases such as cancer, there is a pressing need to find new potential drugs. Natural products of microbial origin underlie a series of highly important drugs currently used to treat a wide variety of illnesses. The strategies adopted most recently in the search for new molecules rely on the genetic analysis of biosynthetic gene clusters or genomes, as well as on the mining of biosynthetic genes (in particular of the PKS and NRPS types), combined with bioactivity- and structure-guided discovery. The study of microorganisms inhabiting extreme environments is likewise a promising strategy, since a large fraction of these microorganisms is expected to be still unknown and to possess unique adaptive strategies for their habitat, including the production of novel molecules.

In this work we combine both strategies, biosynthetic gene mining and bioactivity testing, to assess the bioactive potential of a polar microbiome, the Dry Valleys of Antarctica. Two main objectives were defined: (1) the design of primers able to amplify the biosynthetic KS and A domains of PKS and NRPS genes, respectively; and (2) the analysis of microbial diversity from pyrosequencing data, the isolation of microorganisms from Antarctic environmental samples, and a screening of the bioactive potential of the isolates through bioassays.

Two new primer pairs were developed in this work, able to efficiently amplify the biosynthetic domains (KS and A) from bacterial strains belonging to at least the phyla Actinobacteria, Cyanobacteria, Proteobacteria and Planctomycetes; they are useful for large-scale screening of bacterial isolates and for metagenomic bioprospecting studies. If future validation confirms their efficiency, these primers may become a new standard for PCR-based metagenomic bioprospecting.

The Antarctic environmental samples revealed a large diversity of chemically prolific phyla, in particular Actinobacteria and Cyanobacteria. Bacterial isolates from the phyla Actinobacteria, Firmicutes and Proteobacteria were obtained, as well as fungal species. Many of the isolates showed bioactivity in different assays, in particular a fractionated extract with antimicrobial properties produced by a fungus of the genus Penicillium. Two potential new species from two different genera are presented, which also have the genetic capacity to produce secondary metabolites.

Keywords: natural products, secondary metabolism, primers, microbial diversity, PKS, NRPS, bioactivity


Abstract

With the increasing incidence of multidrug-resistant bacteria and of diseases such as cancer, there is an urgent need to find new potential drugs. Microbial natural products have yielded a variety of pharmaceutically important compounds in current use. Present strategies for finding new molecules rely on the genetic analysis of biosynthetic gene clusters and genomes and on gene mining (for PKS and NRPS genes), combined with bioactivity- and structure-guided discovery. The study of microorganisms inhabiting extreme environments is also regarded as a promising strategy, as a large fraction of their microbiota is expected to be still unknown and these organisms are expected to possess unique adaptations to their habitats, including the production of novel molecules. Here, we combine biosynthetic gene mining and bioactivity-guided strategies to survey the bioactive potential of a polar microbiome, the McMurdo Dry Valleys, in Antarctica. To achieve this, two main objectives were pursued: (1) the design of primer pairs to amplify the KS and A domains of PKS and NRPS genes, respectively, from a wide range of chemically prolific bacterial phyla; and (2) biodiversity analysis of pyrosequencing data, isolation and growth of microorganisms from Antarctic environmental samples, and screening of the bioactive potential of the isolates through in vitro assays. Improved primer pairs were obtained that efficiently amplify the biosynthetic domains from strains of at least the phyla Actinobacteria, Cyanobacteria, Proteobacteria and Planctomycetes; they are useful for large-scale screening of bacterial isolates and for bioprospecting in metagenomic studies. If further validation confirms their efficiency, these primers may become a new standard for PCR-based metagenomic bioprospecting. The Antarctic samples were found to harbour a large diversity of prolific phyla, mainly Actinobacteria and Cyanobacteria. Bacterial strains from the phyla Actinobacteria, Firmicutes and Proteobacteria, as well as fungal strains, were isolated. Bioactivity is reported for the first time for several strains, and a potential antimicrobial compound from a Penicillium fungus is described. Furthermore, two potential novel species from two genera are reported which, according to the biosynthetic domain mining, are worth exploring.

Keywords: natural products, secondary metabolism, primers, microbial diversity, PKS, NRPS, bioactivity

Table of contents

Acknowledgements
Summary
Abstract
List of Figures
List of Tables
List of presentations
List of abbreviations
I – Introduction
1 - A new era in Natural products - Gene and Genome mining for discovery of (novel) molecules
1.1 - Culture-dependent approach
1.2 - Culture independent approach – Metagenomics
Chapter 1 - Design of primer pairs targeting the biosynthetic domains of PKS and NRPS genes in
I – Background
II – Goals
III – Materials and Methods
1 - Design of oligonucleotide primers
1.1 - Sequence retrieval for KS Domain of Type I PKS genes and AD domain for NRPS genes
1.2 - Multiple-sequence alignment and Phylogenetic Analysis
1.3 - In silico analysis of primers reliability
1.5 - Optimization of the PCR Amplification protocol
1.5.1 – Genomic DNA extraction and quantification
1.5.2 – Optimization of PCR reactions: reagent and thermal conditions
1.5.3 – Comparison of amplification results for protocols using designed primers or literature primers
1.6 - Sequencing of PCR products – Test and Phylogenetic analysis
1.6.1 – Phylogenetic and NaPDoS analysis
IV – Results
V - Discussion
Chapter 2 – Biodiversity and bioactive potential of the McMurdo Dry Valleys, Antarctica
I – Background
II – Goals
III – Materials and Methods
1 – Biodiversity of a Soil Transect and of a Rock with endolithic colonization in Victoria Valley, Victoria Land, Antarctica
1.1 - Sample collection
1.2 - eDNA Extraction and 16S rRNA gene sequencing
1.2.1 - QIIME Analysis of the 16S rRNA gene
1.3 - Prediction of the microbiome metabolic capacity using PICRUSt
2 - Isolation of Microorganisms from a Soil Transect and endolithic sample from Victoria Valley
2.1 – Culture strategies – Soil samples T5 and T6
2.2 - Identification of Bacterial and Fungal Isolates through 16S rRNA and ITS gene amplification and Phylogenetic analysis
2.2.1 - Identification of bacterial and fungal isolates using FTA Indicating Micro cards (Whatman™)
2.2.2 – Identification of bacterial isolates through 16S rRNA gene amplification
2.2.2.1 – DNA extraction
2.2.2.2 – PCR Amplification of the 16S rRNA gene
2.3 – Phylogenetic analysis
3 - Screening by PCR of PKS and NRPS genes in bacterial isolates
4 - Preparation of organic extracts for Bioactivity-Guided Isolation of Bioactive Molecules
4.1 – Organic extraction for Bioactivity Screenings
4.2 - Organic extraction (methanol) and fractionation (VLC) from the Penicillium citrinum strain
4.2.1 – Organic extraction
4.2.2 – Fractionation of the organic extract
4.2.2.1 - Flash-chromatography of fraction 31B
5 - Bioassays
5.1 - Antimicrobial screening susceptibility assay
5.2 - MTT Assay
IV - Results
1 - Biodiversity of a soil transect and of a rock with endolithic colonization in Victoria Valley, Victoria Land, Antarctica
1.1 - Alpha-diversity
1.2 – Beta-diversity
1.3 – Taxonomic composition
1.4 – Predicted Functional profile from 16S rRNA gene
2 - Biodiversity of culturable strains from the McMurdo Dry Valleys, Antarctica
2.1 – Isolation, identification and phylogenetic analysis of obtained Isolates
2.1.1 – Firmicutes strains
2.1.2 – Actinobacterial strains
2.1.3 – Proteobacteria Isolates
2.1.4 – Fungal Isolates
3 – Bioactive potential of Isolated Microorganisms
3.1 – PCR Screening of bacterial isolates: PKS and NRPS genes
3.2 – Bioassay Screening
3.2.1 – Antimicrobial Assay
3.2.2 – Cytotoxic Assay
V - Discussion
1 - Biodiversity and Functional Profile of Endolithic and Soil Microbiomes from the McMurdo Dry Valleys
2 - Culture-dependent Isolation of Actinobacteria from McMurdo Dry Valleys
2 - Bioactive potential from McMurdo Dry Valleys Microbial Isolates
VI – General Conclusion
VII - References
VIII – Supplementary Information


List of Figures

Figure 1 – Phylogenetic tree of the KS domain sequences collected for primer design.
Figure 2 – Phylogenetic tree of the AD domain sequences collected for primer design.
Figure 3 – PCR amplification of KS and AD domains from cyanobacterial gDNA.
Figure 4 – PCR amplification of KS and AD domains from gDNA of Planctomycetes and Streptomyces strains.
Figure 5 – Electrophoresis gel of PCR products of KS and AD domain amplification using the optimized conditions.
Figure 6 – Phylogenetic tree encompassing the KS domain sequences used for primer design.
Figure 7 – Structures of molecules produced by Antarctic microorganisms, presented in Table 5.
Figure 8 – Location of sampling points in Victoria Valley (marked in red).
Figure 9 – (A) Fractionation apparatus; (B) filtration of fraction through cotton.
Figure 10 – Rarefaction curves for alpha-diversity metrics. (A) chao1; (B) phylogenetic diversity.
Figure 11 – PCoA plots using the unweighted (A) and weighted (B) UniFrac metrics.
Figure 12 – (A) Bar chart of frequency of phyla-affiliated OTUs per sampling point summary.
Figure 13 – Microphotographs of Paenisporosarcina sp. isolates.
Figure 14 – Phylogenetic tree of the 16S rRNA nucleotide sequences of the obtained isolates (2F, 2H, 13F, 13G, 16D, 16E, 17, 34, 36, 39, 47) from the Firmicutes phylum and the closest matches in the NCBI 16S database.
Figure 15 – Photographs of actinobacterial strains isolated from soil sample T5.
Figure 16 – Phylogenetic tree of the 16S rRNA nucleotide sequences of the obtained isolates from the Actinobacteria phylum and the closest matches in the NCBI 16S database.
Figure 17 – Photographs of fungal strains isolated from soil samples T5 and T6.
Figure 18 – Phylogenetic tree of the ITS and D1/D2 rDNA nucleotide sequences of the obtained fungal isolates and the closest matches in the NCBI nucleotide collection.
Figure 19 – PCR amplification of KS using primer pairs degK2F/degK2R and DKF/DK
Figure 20 – PCR amplification of AD domain using primer pair A3F/A7R.
Figure 21 – PCR amplification of KS and AD domains using primer pairs degK2F/degK2R and A3F/A7R, respectively.
Figure 22 – Photographic record of inhibition halos.
Figure 23 – Photographic record of inhibition halos.
Figure 24 – Photographic record of inhibition halos of the subfractions tested.
Figure 25 – Percentage of cell viability in the tumor cell line SH-SY5Y (neuroblastoma), after 24h and 48h of exposure to organic extracts of Actinobacteria isolates.
Figure 26 – Percentage of cell viability in the tumor cell line T47-D (breast ductal carcinoma)
Figure 27 – Percentage of cell viability in the tumor cell line SH-SY5Y (neuroblastoma), after 24h and 48h of exposure to VLC fractions of Penicillium citrinum strain 31.
Figure 28 – Percentage of cell viability in the tumor cell line T47-D (breast ductal carcinoma), after 24h and 48h of exposure to VLC fractions of Penicillium citrinum strain 31.
Figure 29 – Percentage of cell viability in the tumor cell line SH-SY5Y (neuroblastoma), after 24h and 48h of exposure to VLC sub-fractions (fraction B) of Penicillium citrinum strain 31.
Figure 30 – Percentage of cell viability in the tumor cell line T47-D (breast ductal carcinoma), after 24h and 48h of exposure to VLC sub-fractions (fraction B) of Penicillium citrinum strain 31.
Figure S1 – Electrophoresis gel of PCR products of KS domain amplification using the optimized conditions. (A) using primer pair KSF1/KS_v2Rv and (B) using primer pair degK2F/degK2R.


List of Tables

Table 1 – Principal domains present in PKS and NRPS enzymes and their associated functions.
Table 2 – Some of the principal available bioinformatic tools and databases for supporting natural products discovery, with relevance for this study.
Table 4 – List of primer pairs published for amplification of KS and AD domains of PKS and NRPS genes, respectively.
Table 5 – Distribution of KS and AD domain sequences selected from the 10 bacterial phyla and group in study.
Table 6 – Examples of new bioactive molecules retrieved from Antarctic microorganisms. The respective structures are depicted in Figure 7.
Table 7 – List of primers used in this work.
Table 8 – Solvent mixtures (eluents) used in the fractionation of the crude extract from Penicillium citrinum strain 31.
Table 9 – Solvent mixtures used for elution in flash chromatography of fraction 31B.
Table 10 – PICRUSt KEGG pathways.
Table 11 – Antimicrobial activity of organic extracts tested.
Table 12 – Antimicrobial activity of VLC fractions from the crude extract of Penicillium citrinum strain 31.
Table 13 – Antimicrobial activity of flash-chromatography sub-fractions from 31B-1 to 31B-9.
Table 14 – Summary table of obtained isolates and results from PKS/NRPS gene and bioassay screening.

Table S1 – Information on nucleotide sequences of the KS domain collected for primer design.
Table S2 – Information on nucleotide sequences of the AD domain collected for primer design.
Table S3 – Primer pairs designed for KS and AD domains.
Table S4 – Information on bacterial strains used for primer testing, including antiSMASH genome analysis.
Table S5 – Detailed information on alpha-diversity measures obtained for each sample in study, including number of sequences, average phylogenetic diversity, chao1 and observed OTUs metrics.
Table S6 – Summary table of taxonomic frequency distributions at level for Cyanobacteria.
Table S7 – Summary table of taxonomic frequency distributions at Phylum level.


List of presentations

Rego A, Costa MS, Ramos V, Vasconcelos V, Magalhães C, Leão P (2016). Biodiversity and Chemodiversity of Extreme Polar Bacteria. IJUP 16, Porto, Portugal, February 17-19. Oral communication

Rego A, Costa MS, Ramos V, Hong SG, Vasconcelos V, Magalhães C, Leão P (2016). Extreme Polar Microorganisms: A Biotechnological Approach. VIII Conferência Portuguesa de Ciências Polares, Lisboa, Portugal, October 26-28. Oral communication

Rego A, Costa MS, Ramos V, Hong SG, Vasconcelos V, Magalhães C, Leão P (2017). Extreme Polar Microorganisms: Biodiversity and Chemodiversity. IJUP 2017, Porto, Portugal, February 8-10. Oral communication

Rego A, Costa MS, Ramos V, Vasconcelos V, Baptista M, Carvalho F, Magalhães C, Leão P (2017). Biodiversity and Bioactive Potential of Antarctic Microbiomes. Bioinformatics Open Days, Braga, Portugal, February 22-24. Poster presentation

Rego A, Costa MS, Ramos V, Hong SG, Vasconcelos V, Magalhães C, Leão P (2017). Exploring Antarctic Microbiomes as Source of Bioactive Molecules. XIIth SCAR Biology Symposium, Leuven, Belgium, 10-14 July 2017. Oral communication


List of abbreviations

16S rRNA: 16S ribosomal RNA
A(D): Adenylation domain
ACP: Acyl carrier protein
AIA: Actinomycete Isolation Agar
BGC: Biosynthetic gene cluster
BLAST: Basic Local Alignment Search Tool
bp: Base pair(s)
CTAB: Cetyltrimethylammonium bromide-polyvinylpyrrolidone-β-mercaptoethanol
DMEM: Dulbecco's Modified Eagle Medium
DMSO: Dimethyl sulfoxide
dNTP: Deoxyribonucleotide triphosphate
e.g.: Exempli gratia
END: Endolithic sample
ER: Enoyl reductase
ICTAR: International Centre for Terrestrial Antarctic Research
ITS: Internal transcribed spacer
KEGG: Kyoto Encyclopedia of Genes and Genomes
KO: KEGG Orthologs
KS: Ketosynthase
LB: Luria Broth
MB: Marine Broth
MH: Mueller-Hinton
MNPS: Modified nutrient-poor sediment
MTT: 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide
NCBI: National Center for Biotechnology Information
NGS: Next-Generation Sequencing
NJ: Neighbor-joining
NP(s): Natural product(s)
NPS: Nutrient-poor sediment
NRP: Nonribosomal peptide
NRPS: Nonribosomal peptide synthetase
OTU: Operational Taxonomic Unit
PCoA: Principal Coordinate Analysis
PCP: Peptidyl Carrier Protein
PCR: Polymerase chain reaction
PD: Phylogenetic diversity
PK: Polyketide
PKS: Polyketide synthase
PRISM: Prediction Informatics for Secondary Metabolomes
PUFA: Polyunsaturated fatty acid
PVC: Planctomycetes, Verrucomicrobia and Chlamydiae
QIIME: Quantitative Insights Into Microbial Ecology
RB: Round-bottom
RKO: Colon carcinoma cell line
SH-SY5Y: Neuroblastoma cell line
TAE: Tris-acetate-EDTA
TLC: Thin layer chromatography
UV: Ultraviolet
VLC: Vacuum liquid chromatography


I – Introduction

Since Alexander Fleming reported the discovery of penicillin in 1929 [1], microorganisms have been recognized as a rich source of bioactive compounds and have yielded a plethora of medically important compounds [2]. Fungi from the Ascomycota phylum and Bacteria, specifically the phyla Actinobacteria, Cyanobacteria, Proteobacteria and Firmicutes, are considered prolific producers of natural products [3]. With the current prevalence of diseases such as cancer and the increasing incidence of multidrug-resistant bacterial infections [4], it is extremely urgent to find and develop new potential drugs. Microbial natural products (NPs) are considered among the most promising and reliable sources of novel drug leads [4]. Despite their structural and chemical diversity, a large fraction of bioactive microbial natural products belong to the polyketide (PK) and nonribosomal peptide (NRP) biogenetic families, or are hybrids of the two. Clinically important drugs, including antibiotics (e.g. the PKs erythromycin and tetracyclines, and the NRP penicillin G [5]), anticancer chemotherapeutics (e.g. the hybrid NRP-PK anticancer agent didemnin [6]) and immunosuppressants [7] (e.g. the NRP cyclosporine [8]), fit in these natural product families [3].

The polyketide synthases (PKSs) and nonribosomal peptide synthetases (NRPSs) are the families of enzyme complexes responsible for PK and NRP biosynthesis [9]. They are integral parts of biosynthetic gene clusters (BGCs), i.e. sets of two or more genes physically grouped on the genome that encode the biosynthetic pathway for the production of a natural product [10]. These enzymes are organized in modules and act by sequential thiotemplated assembly of acyl-CoA (PKS) and amino acid (NRPS) building blocks, catalysing C-C and C-N bond linkages, respectively [11,12]. A PKS module is minimally composed of a ketosynthase (KS), an acyltransferase (AT) and a thiolation (T) domain (the latter also referred to as acyl carrier protein, ACP). Additionally, a module may possess ketoreductase (KR), dehydratase (DH) and enoyl reductase (ER) domains [13] (see Table 1 for a description of each domain's function). To date, PKSs are classified into three different classes according to the organization of their catalytic domains (non-modular, monomodular and multimodular) and into different subclasses by their mode of action (e.g. iterative, non-iterative, cis-AT or trans-AT) [13,14]. Type I PKSs employ a multimodular strategy, with each module made up of specific catalytic domains for the recognition, activation and condensation of acyl-CoA, while in Type II and Type III PKSs each catalytic site resides in a separate protein [3,15]. Type I PKSs are responsible for the biosynthesis of macrolides, polyethers and polyenes, whereas type II PKSs are usually associated with the formation of cyclic aromatic and often polycyclic PK compounds [16]. Type III PKSs, also referred to as chalcone synthase-like PKSs and found predominantly in plants, are condensing enzymes without a T domain and typically act directly on acyl-CoA substrates [17].

NRPSs resemble type I PKSs in their modular organization, with each amino acid incorporated and processed by one module [16]. Analogously to PKSs, an NRPS module is minimally composed of a condensation (C), an adenylation (A) and a T domain (the latter also referred to as peptidyl carrier protein, PCP) [12], and optionally possesses cyclization (Cy), epimerization (E), methyltransferase (MT) and ketoreductase (KR) domains (Table 1), among others [9]. The KS, A and C enzymatic domains in particular possess highly conserved core motifs [18]. Type I PKSs are frequently associated with NRPSs, co-occurring as part of the same BGC, which results in hybrid molecules with increased structural diversity compared with non-hybrid PKs or NRPs.

Traditionally, the search for novel bioactive compounds depends on the cultivation of the microorganisms, followed by bioactivity screening of their organic (or sometimes aqueous) extracts. A bioactivity-guided isolation through consecutive fractionations and bioassay testing is then typically employed until the pure compound is isolated. This approach has uncovered over 11,000 PK and NRP products [19]. However, with the advance in DNA sequencing brought by Next-Generation Sequencing (NGS) technologies [20], an explosion of genome sequences and genome-mining studies has revealed microorganisms to be an underexplored source of PK and NRP compounds, with only a small fraction of PKS-NRPS gene clusters (10%) being associated with known products [19]. This realization triggered renewed interest and ushered in a new era of natural products research [21].


Table 1 - Principal domains present in PKS and NRPS enzymes and their associated functions. Adapted from Bachmann and Ravel (2009) [22] and Adamek et al. (2017) [23].

Domain | Essential function
AT – Acyltransferase | Selection and activation of acyl-CoA substrate through acylation
KS – Ketosynthase | Catalyses C-C bond formation through Claisen condensation
KR – Ketoreductase | Reduction of keto groups to hydroxyl groups
DH – Dehydratase | Dehydration of hydroxyl group to α,β-enoyl
ER – Enoyl reductase | Reduction of the enoyl double bond
T – Thiolation | Phosphopantetheinylated carrier protein shared by PKS and NRPS (also commonly referred to as ACP domain for PKSs and PCP domain for NRPSs)
TE – Thioesterase | Cleavage of mature PK/NRP via macrocyclization or release of the full-length chain
MT – Methyltransferase | Methylation of PK, and N-methylation of NRP
C – Condensation | Formation of peptide bond
A – Adenylation | Amino acid activation via intermediary adenylation
E – Epimerization | Epimerization of amino acids, flipping stereochemistry
Re – Reductase | Reduction (usually terminal) of mature PK/NRP, resulting in an aldehyde
Cy – Cyclization | Formation of a peptide bond and subsequent amino acid cyclization
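As an illustration of the minimal module compositions given in the introduction (KS, AT and T for a PKS module; C, A and T for an NRPS module), the following is a minimal sketch that checks which module type a list of domain abbreviations could represent. The function name `module_kinds` and the example domain lists are hypothetical and are not taken from any of the tools cited in this work; real domain annotation relies on sequence analysis rather than on labels.

```python
# Core domains required for a minimal module, as stated in the introduction.
# Optional domains (KR, DH, ER, Cy, E, MT, ...) are simply ignored here.
CORE_DOMAINS = {
    "PKS": {"KS", "AT", "T"},
    "NRPS": {"C", "A", "T"},
}


def module_kinds(domains):
    """Return the module types whose core domains are all present.

    `domains` is a list of domain abbreviations as used in Table 1.
    A hybrid set containing both cores returns both types.
    """
    present = set(domains)
    return sorted(kind for kind, core in CORE_DOMAINS.items() if core <= present)


print(module_kinds(["KS", "AT", "KR", "T"]))  # PKS module with an optional KR
print(module_kinds(["C", "A", "T", "E"]))     # NRPS module with an optional E
print(module_kinds(["KS", "T"]))              # incomplete: AT is missing
```

The sketch treats a module purely as a set of labels, which is enough to show why KS and A domains are convenient screening targets: every complete PKS or NRPS module must contain one.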


1 - A new era in Natural products - Gene and Genome mining for discovery of (novel) molecules

1.1 - Culture-dependent approach

Despite the usefulness of NPs, pharmaceutical companies showed a growing disinterest in NP drug discovery programmes in the last decades of the previous century, in part because of the repeated isolation of known compounds [24], the laborious, time-consuming and expensive procedures needed to isolate them, and the difficulty of obtaining synthetic analogues [25]. At the same time, the emergence of combinatorial chemistry promised plenty of new compounds for bioactivity screening [26]. The advance of new sequencing technologies at the beginning of the 21st century [20] ushered in a new golden era of natural products discovery [25]. The exponential increase in the number of genome sequences available in databases made possible the development of different bioinformatic tools (Table 2), providing a more targeted discovery and overcoming most of the disadvantages referred to above. This development was the basis for an explosion of genome-mining studies supporting natural products discovery [27,28]. The so-called “genome mining” approach is useful for identifying gene clusters potentially responsible for the synthesis of novel compounds. By understanding the composition and regulation of the gene clusters, a targeted isolation and characterization of new PK and NRP molecules can be pursued, reducing prospecting time and costs. In addition, the information can be used to help activate “silent” gene clusters, as well as to optimize the conditions for heterologous expression [23], as exemplified by the production of the antibiotic pantocin B in E. coli [29]. This approach revealed unexpected enzymatic diversity and extended the knowledge of the distribution of these enzymes across the three domains of life [9]. Of particular relevance, a positive correlation between genome size and the fraction of the genome allocated to secondary metabolite biosynthesis was observed by Konstantinidis and Tiedje (2004) [30]. It has also been described that genomes larger than 3 Mb are likely to have at least one PK and NRP gene cluster [31], while genomes with fewer than 2,000 ORFs are very likely not to possess secondary metabolism-related genes [30].

With the progress in sequencing technologies and the increasing number of DNA sequences deposited in public databases, not only genome mining but also homology-based PCR screening [32] started to be used to screen the biosynthetic potential of strains before large-scale cultivation and/or sequencing of the entire genome. The KS, A and C enzymatic domains possess highly conserved core motifs [18], consensus sequences strong enough to allow the design of PCR primers. However, the majority of the primer sets developed were designed to be specific for certain bacterial phyla, usually the most prolific ones such as Actinobacteria [33] and Cyanobacteria [34,35], and also for specific genera such as Streptomyces [36] (Table 3), which restricts their usefulness. Together with the development of bioinformatic tools directed to the identification of biosynthetic gene clusters and the discovery of secondary metabolites, the amount of available information (genomes and DNA sequences) led to the creation of a series of natural product biosynthesis-related databases [37]. Some examples of these platforms are antiSMASH [38], a tool directed to the identification of BGCs and catalytic domains through the analysis of entire genomes or BGCs; NaPDoS [11], a web tool directed to the identification of catalytic domains of PKSs and NRPSs; and NRPSpredictor2 [39], a web server for the prediction of the substrate specificity of A domains (see Table 2).
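The idea behind homology-based PCR screening with degenerate primers can be sketched in silico as follows. This is a minimal illustration, assuming exact matching of IUPAC ambiguity codes on a single strand; the primer and template sequences below are hypothetical and are not the primers designed or cited in this work, and the function names are chosen for illustration only.

```python
import re

# IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also handles ambiguity codes (e.g. R <-> Y).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def primer_regex(primer):
    """Translate a degenerate primer into a regular expression."""
    return "".join(IUPAC[base] for base in primer.upper())


def reverse_complement(seq):
    """Reverse complement of a (possibly degenerate) DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def in_silico_amplicons(template, forward, reverse, max_len=2000):
    """Return amplicons on the given strand of `template`.

    An amplicon starts at a forward-primer match and ends at a match of
    the reverse complement of the reverse primer, within `max_len` bases.
    """
    template = template.upper()
    fwd = re.compile(primer_regex(forward))
    rev = re.compile(primer_regex(reverse_complement(reverse)))
    amplicons = []
    for f in fwd.finditer(template):
        for r in rev.finditer(template, f.end()):
            if r.end() - f.start() <= max_len:
                amplicons.append(template[f.start():r.end()])
    return amplicons


# Hypothetical toy example: R matches A or G, Y matches C or T.
template = "CCATGGATCCCCCTGCAAAGG"
print(in_silico_amplicons(template, "ATGRAT", "TTYGCA"))
```

A real primer check would also need to scan the opposite strand and tolerate mismatches near the 5' end; the sketch only shows how degeneracy widens the set of templates a single primer pair can recognise.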

Table 2 - Some of the principal available bioinformatic tools and databases for supporting natural products discovery, with relevance for this study. Adapted from Medema and Fischbach (2015) [7] and Adamek et al. (2017) [23].

Tool or database | Web server URL | Brief description | Reference
antiSMASH | https://antismash.secondarymetabolites.org/ | Web-based tool for the automatic genomic identification of BGCs. | 38
NaPDoS | http://napdos.ucsd.edu/ | Web-based tool for fast identification and analysis of secondary metabolite genes. | 11
eSNaPD | http://esnapd2.rockefeller.edu/ | Web server that provides an automated analysis tool for surveying secondary metabolite gene cluster diversity in metagenomics studies. | 40
SMURF | http://jcvi.org/smurf/index.php | Web-based tool that finds secondary metabolite biosynthesis genes and pathways in fungal genomes. | 41
PRISM | http://magarveylab.ca/prism/ | Computational tool for the identification of BGCs and prediction of genetically encoded NRPs and type I and II PKs. | 24
NRPS/PKS substrate predictor | http://nrps.igs.umaryland.edu/ | Knowledge-based tool for elucidating domain organization and substrate specificity of NRPSs and PKSs. | 22
NRPSpredictor2 | http://nrps.informatik.uni-tuebingen.de/ | Predictor of A domain specificity. | 39
NORINE | http://bioinfo.lifl.fr/norine/ | Database of NRPs. | 37

1.2 - Culture-independent approach – Metagenomics

The ability to efficiently annotate and predict NP biosynthetic genes, as described above, opened the door to culture-independent natural products discovery. In fact, one of the main barriers faced by traditional natural products research is the need to isolate and grow the microorganisms in the laboratory. It is presumed that the majority of prokaryotes are present in oceanic and terrestrial subsurface environments42, typically remaining inaccessible and unstudied. One gram of soil is expected to contain 10^7 – 10^10 prokaryotic cells42, equivalent to about 10^6 different genomes43. The uncultured microorganisms present in soil and other environmental samples represent a rich reservoir of novel natural products43. The strategies currently employed to maximize the recovery into culture of the biodiversity present in a given sample include the use of a variety of media constituents, changes in growth conditions, mimicking of environmental conditions, use of minimal media for oligotrophic sites, consecutive dilution of the original inoculum, community culture and co-culture, among others2,43. Nevertheless, the well-known “great plate count anomaly”44, which refers to a cultivable fraction of the microbial richness below 1%45, is still observed today. Hence, a huge fraction of the biodiversity (and associated chemodiversity) is lost during the culture process in the laboratory. Furthermore, because microbial secondary metabolites (i.e., natural products) are sometimes produced in response to some kind of stress or environmental stimulus (e.g. environmental stress, pathogen attack), laboratory cultures of microorganisms under standard growth conditions often do not provide access to the full natural products potential of a given isolate. Against this backdrop, metagenomics has emerged as a path to reach the uncultured biodiversity46 and the corresponding biosynthetic diversity.
The extraction of DNA from environmental samples (eDNA), i.e. the metagenome, allows, on the one hand, the identification of the bacterial species present (cultured and uncultured) through PCR amplification and sequencing of the 16S rRNA gene2 and, on the other hand, provides information on the diversity of biosynthetic genes through PCR amplification and sequencing of the catalytic domains, usually the KS34, A47 and C48 domains. This PCR-based sequencing approach can be used to identify clones carrying known biosynthetic domains in metagenomic libraries, as well as to find totally novel molecules produced by known BGCs7. Notably, metagenomics associated with heterologous expression of eDNA has enabled the discovery of different natural products from uncultured microorganisms49. New areas of biosynthetic diversity can be identified with available bioinformatic tools such as NaPDoS (see Table 2), a web tool useful for assessing BGC diversity through the analysis of phylogenetic relationships of sequence tags from PKS and NRPS genes11. Further, eSNaPD (see Table 2), a web-based bioinformatic platform useful for the discovery of BGCs coding for novel NPs from metagenomic data40, can be used to identify potential new BGCs and, consequently, novel molecules. The aforementioned PCR-based approach has resulted in a variety of studies, including some biogeographic studies 50–52 that identified hotspots of bioactive potential and, more recently, the prospecting of Antarctic soils53. In fact, the study of less exploited environments, such as extreme polar environments, has resulted in the discovery of novel species and molecules (reviewed by Wilson and Brimble (2009) 54 in “Molecules derived from the extremes of life”).
Not only is a large part of the microbial diversity in these environments still unknown, but the microbiota in these habitats have also developed unique adaptations in order to survive the extreme environmental stresses, including the biosynthesis of exclusive chemical entities with unprecedented biological activities54,55. A limiting issue of this PCR-based approach concerns the primer sets available for amplification of the catalytic domains. The majority of PCR primers for amplification of the biosynthetic domains (mainly for KS and A, but also C domains) were initially designed to be specific for certain bacterial phyla, typically the most prolific, as is the case of Actinobacteria 33 and Cyanobacteria 35. Often these primers were restricted to one catalytic domain class [e.g. exclusive to the modular subclass of Type I PKS genes 56], in part due to the higher number of sequences available for these bacterial groups. Currently, however, with the extremely large number of genome sequences present in public databases, we have access to sequences from a broad variety of bacterial phyla 57, including the most abundant and chemically prolific ones. This creates an opportunity to design universal primers for PKSs and NRPSs, which would give a more accurate representation of the real biosynthetic diversity present in environmental samples. Accordingly, we aim to combine state-of-the-art culture-dependent and culture-independent approaches to achieve the overall goal of exploring the diversity of bioactive small molecules from polar microbiomes. Specifically, the objectives of this dissertation, which correspond to Chapters 1, 2 and 3, respectively, are:

(1) To design primer pairs capable of amplifying the KS and A domains of PKS and NRPS genes, respectively, from a wide range of chemically prolific bacterial phyla.
(2) To analyse the biodiversity and the biosynthetic richness of polar microbiomes, through amplification and sequencing of the 16S rRNA gene and PICRUSt predictions.
(3) To isolate and grow microorganisms from Antarctic environmental samples and analyse their bioactive potential through in vitro assays.

Table 3 – List of published primer pairs for amplification of the AD and C domains of NRPS genes and the KS domain of PKS genes.

| Domain and respective gene (amplicon size) | Primer name | Sequence (5'-3') | Reference | Notes |
|---|---|---|---|---|
| AD – NRPS (700-800 bp) | A3F | GCSTACSYSATSTACACSTCSGG | 33 | Specific for Actinobacteria |
| AD – NRPS (700-800 bp) | A7R | SASGTCVCCSGTSCGGTA | 33 | Specific for Actinobacteria |
| AD – NRPS (480 bp) | NRPS_F | CGCGCGCATGTACTGGACNGGNGAYYT | 53 | Designed for metagenomic studies |
| AD – NRPS (480 bp) | NRPS_R | GGTCCGCGGGACGTARTCNARRTC | 53 | Designed for metagenomic studies |
| AD – NRPS (300 bp) | A2gamF | AAGGCNGGCGSBGCSTAYSTGCC | 58 | Conserved motif A2 |
| AD – NRPS (300 bp) | A3gamR | TTGGGBIKBCCGGTSGINCCSGAGGTG | 58 | Conserved motif A3 |
| AD – NRPS (1000 bp) | MTF2 | GCNGG(C/T)GG(C/T)GCNTA(C/T)GTNCC | 35 | Specific for Cyanobacteria |
| AD – NRPS (1000 bp) | MTR | GCNGG(C/T)GG(C/T)GCNTA(C/T)GTNCC | 35 | Specific for Cyanobacteria |
| C – NRPS (700 bp) | CnDmF | ATGCATCACATT(AG)TN(TC)(TC)NGA | 48 | For metagenomic studies |
| C – NRPS (700 bp) | DCCR | GTGTTNAC(AG)AA(AG)AANCC(AGT)AT | 48 | For metagenomic studies |
| KS – Type I PKS (700 bp) | degK2F | GCIATGGAYCCICARCARMGIVT | 59 | Specific for Type I modular PKS |
| KS – Type I PKS (700 bp) | degK2R | GTICCIGTICCRTGISCYTCIAC | 59 | Specific for Type I modular PKS |
| KS – Type I PKS (1200-1400 bp) | K1F | TSAAGTCSAACATCGGBCA | 33 | Specific for Actinobacteria |
| KS – Type I PKS (1200-1400 bp) | M6R | CGCAGGTTSCSGTACCAGTA | 33 | Specific for Actinobacteria |
| KS – Type I PKS (700 bp) | DKF | GTGCCGGTNCC(AG)TG(GATC)G(TC)(TC)TC | 34 | Specific for Type I PKS |
| KS – Type I PKS (700 bp) | DKR | GCGATGGA(TC)CCNCA(AG)CA(AG)(CA)G | 34 | Specific for Type I PKS |
| KS – Type I PKS (700 bp) | KSF | CGCTCCATGGAYCCSCARCA | 60 | Specific for Type I PKS |
| KS – Type I PKS (700 bp) | KSR | GTCCCGGTGCCRTGSSHYTCSA | 60 | Specific for Type I PKS |
| KSα – Type II PKS (600 bp) | KSα-F | TSGCSTGCTTCGAYGCSATC | 36 | Specific for Streptomyces and Type II PKS |
| KSα – Type II PKS (600 bp) | KSα-R | TGGAANCCGCCGAABCCGCT | 36 | Specific for Streptomyces and Type II PKS |
| KSα – Type II PKS (554 bp) | 540F | GGITGCACSTCIGGIMTSGAC | 61 | Specific for Actinobacteria |
| KSα – Type II PKS (554 bp) | 1100R | CCGATSGCICCSAGIGAGTG | 61 | Specific for Actinobacteria |
| KSβ – Type II PKS (1500 bp) | dp:KSα | TTCGGCGGXTTCCAGTCXGCCATG | 62 | Specific for iterative Type II KSβ PKS |
| KSβ – Type II PKS (1500 bp) | dp:ACP | TCCAGCAGCGCCAXCGACTCGTAXCC | 62 | Specific for iterative Type II KSβ PKS |
| KSβ – Type II (350 bp) | PKS_F | GGCAACGCCTACCACATGCANGGNYT | 53 | Designed for metagenomic studies |
| KSβ – Type II (350 bp) | PKS_R | GGTCCGCGGGACGTARTCNARRTC | 53 | Designed for metagenomic studies |
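Several primers in Table 3 are written with IUPAC ambiguity codes, so each "primer" is really a pool of oligonucleotides. As a rough illustration of how the size of that pool (the degeneracy) can be counted, the sketch below multiplies the number of bases allowed at each position. The function name `degeneracy` is hypothetical, inosine (I) is assumed not to multiply the pool because it is a single synthesized base, and the bracketed (C/T) and X notations used by some entries are not handled.

```python
from math import prod

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
    # Assumption: inosine is one physical base that pairs promiscuously,
    # so it counts as a single option rather than four.
    "I": "I",
}


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides encoded by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


if __name__ == "__main__":
    for name, seq in [("A3F", "GCSTACSYSATSTACACSTCSGG"),
                      ("A7R", "SASGTCVCCSGTSCGGTA")]:
        print(name, degeneracy(seq))
```

A lower value generally means a larger share of the pool matches any one template, which is one reason to reduce degeneracy when a primer gives weak or non-specific amplification.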

Chapter 1 - Design of primer pairs targeting the biosynthetic domains of PKS and NRPS genes in Bacteria


I – Background

PKSs and NRPSs are mega-enzymes responsible for the biosynthesis of a large fraction of NPs of pharmacological importance 63. With the recent advances in genome sequencing, it is now recognized that biosynthetic potential is not restricted to the most prolific and well-studied phyla in this regard, such as Actinobacteria and Cyanobacteria 3, but is widespread throughout the tree of life 9. Recent bioinformatic studies have suggested that under-explored bacterial groups, previously considered poor in NPs, actually possess the genetic potential to produce secondary metabolites of the NRPS and PKS types 3. Examples are members of the bacterial phyla Verrucomicrobia, Chlamydiae and Elusimicrobia. PCR-based strategies using primers targeting biosynthetic genes, such as the KS domain 34 of PKSs or the AD 47 or C 48 domains of NRPSs, have been used to assess the bioactive potential of bacterial isolates. More recently, this approach has also been employed to survey the biosynthetic potential of microbiomes/metagenomes directly from eDNA 50–52. However, this strategy is currently limited by the use of primers originally designed to be specific to some bacterial phyla, mainly Actinobacteria 33,36. As such, with the current molecular tools for PCR-based screening (which is predicted to become even more frequent due to NGS technologies), some biosynthetic potential remains unreachable, in particular that of phyla that have traditionally been neglected in terms of secondary metabolite biosynthesis. Against this backdrop, we envision that better-performing, universal primer pairs for PKSs and NRPSs can prove useful for large surveys of biosynthetic potential from eDNA, which we expect to become ever more common.

II – Goals

Here, we aimed to design “universal” primer pairs, amenable to NGS-based studies, able to amplify the biosynthetic domains of genes responsible for the production of pharmaceutically relevant NPs (in this case PKs and NRPs) from a wide range of bacterial phyla (Cyanobacteria, Firmicutes, Chloroflexi, Actinobacteria, Bacteroidetes, Elusimicrobia, the Planctomycetes, Verrucomicrobia and Chlamydiae (PVC) group, Deltaproteobacteria, Alphaproteobacteria, Betaproteobacteria and Gammaproteobacteria).


III – Materials and Methods

1 - Design of oligonucleotide primers

1.1 - Sequence retrieval for the KS domain of Type I PKS genes and the AD domain of NRPS genes

KS domain sequences were retrieved for the seven described groups of Type I PKS (Enediyne, PUFA, Trans-AT, Hybrid, Iterative, Modular and KS1 11). Ten bacterial phyla and one bacterial group (PVC), i.e. the most abundant according to the most recent tree of life 57, were selected with the intent of covering most of the potential biodiversity. The selected phyla are listed in Table 4. Amino acid sequences of KS domains from already characterized molecules of each PKS class were retrieved from the NaPDoS database 11. These sequences were submitted to the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST) for a tblastn search, and the corresponding nucleotide sequences were retrieved. The nucleotide sequences then served as initial queries for a blastn search against the NCBI nucleotide collection database for each of the bacterial phyla. The genomes/biosynthetic gene clusters with homologues were analysed with antiSMASH 3.0 38. The protein sequences of the KS domains identified by antiSMASH were recovered and, through a tblastn search, the corresponding nucleotide sequences were obtained. All nucleotide sequences were then analysed with NaPDoS to gain insight into their class, and grouped according to the result obtained. The resulting nucleotide sequences also served as queries for the remaining ketosynthase classes. For the A domain of NRPS genes, the NORINE 37,64 and NaPDoS databases were used to select the query sequences, and an approach identical to that used for KS domains was followed.


Table 4 - Distribution of KS and AD domain sequences selected from the 10 bacterial phyla and group in study.

| Bacterial phylum | Sequences retrieved: KS domain | Sequences retrieved: NRPS (AD) domain |
|---|---|---|
| Actinobacteria | 8 | 11 |
| Cyanobacteria | 8 | 11 |
| Firmicutes | 10 | 14 |
| Chloroflexi | 5 | 4 |
| Bacteroidetes | 5 | 10 |
| Elusimicrobia | 1 | 2 |
| PVC group | 8 | 3 |
| Deltaproteobacteria | 12 | 10 |
| Alphaproteobacteria | 7 | 9 |
| Betaproteobacteria | 6 | 7 |
| Gammaproteobacteria | 6 | 9 |

1.2 - Multiple-sequence alignment and phylogenetic analysis

A total of 76 and 90 sequences of KS (Supplementary Table S 1) and A (Supplementary Table S 2) domains, respectively, collected from NCBI, were used. The nucleotide sequences were aligned with ClustalW (default parameters) in MEGA7 65. The alignments were manually reviewed to remove short sequences and trim the extremities. For primer design, conserved sites were fixed at 80% in MEGA7 and the alignment was submitted to Geneious 8.1.9 66, with a consensus threshold of 75%, to favour the determination of the most conserved sites. The selected conserved sites were inspected and the degenerate primer pairs were designed manually. The aligned sequences were also subjected to a phylogenetic analysis in MEGA7. Phylogenies were reconstructed using Neighbor-Joining (NJ), with 1000 bootstrap replicates.
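To make the idea of a consensus threshold concrete, the following is a minimal sketch of a threshold-based consensus over an alignment. It is a simplified approximation and not the exact algorithm implemented in MEGA7 or Geneious: for each column, bases are added from most to least frequent until they cover at least the chosen fraction of sequences, and the resulting set is written as an IUPAC code. The function name `consensus` and the toy alignment are hypothetical.

```python
from collections import Counter

# IUPAC codes and the base sets they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
SET_TO_CODE = {frozenset(bases): code for code, bases in IUPAC.items()}


def consensus(aligned, threshold=0.75):
    """Threshold consensus of equal-length aligned sequences ('-' = gap)."""
    out = []
    for column in zip(*aligned):
        needed = threshold * len(column)
        counts = Counter(base for base in column if base in "ACGT")
        chosen, covered = set(), 0
        for base, n in counts.most_common():
            if covered >= needed:
                break
            chosen.add(base)
            covered += n
        # Columns dominated by gaps cannot reach the threshold.
        out.append(SET_TO_CODE[frozenset(chosen)] if covered >= needed else "-")
    return "".join(out)


if __name__ == "__main__":
    toy = ["ATGGAC", "ATGGAT", "ATGCAC", "ATGGAT"]  # hypothetical alignment
    print(consensus(toy, 0.75))
    print(consensus(toy, 0.80))
```

Raising the threshold makes more positions ambiguous, which is why stretches that stay unambiguous at a high threshold are the natural candidates for primer sites.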

1.3 - In silico analysis of primers reliability

The designed primer sequences (Supplementary Table S 3) were submitted to a variety of online servers to calculate their properties and to test their predicted performance by virtual PCR. For oligonucleotide primer properties, the primer sequences were submitted to: OligoCalc version 3.27 67, OligoAnalyzer 3.1 Tool (Integrated DNA Technologies - http://eu.idtdna.com/calc/analyzer) and Multiple Primer Analyzer (ThermoFisher Scientific - https://www.thermofisher.com/pt/en/home/brands/thermo-scientific/molecular-biology/molecular-biology-learning-center/molecular-biology-resource-library/thermo-scientific-web-tools/multiple-primer-analyzer.html). In silico PCR amplifications were performed with In silico simulation of molecular biology experiments (http://insilico.ehu.es/) 68,69, iPCR (product extractor) (http://www.ch.embnet.org/software/iPCR_form.html) and Sequence Manipulation Suite: PCR Products (http://www.bioinformatics.org/sms2/pcr_products.html) 70.
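The principle behind such virtual PCR tools can be sketched in a few lines. The example below is only an illustration and does not reproduce any of the servers listed above: it assumes exact matching (no mismatches allowed), searches a single strand of the template, and treats inosine as matching any base. The function names and the toy template are hypothetical.

```python
import re

IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
    "I": "ACGT",  # assumption: inosine is allowed to pair with any base
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVNI", "TGCAYRSWMKVHDBNI")


def reverse_complement(seq):
    """Reverse complement that also handles IUPAC ambiguity codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def to_regex(primer):
    """Turn a degenerate primer into a regular expression."""
    return "".join(
        base if len(IUPAC[base]) == 1 else f"[{IUPAC[base]}]"
        for base in primer.upper()
    )


def in_silico_pcr(template, forward, reverse, max_len=3000):
    """Predicted amplicons (primer sites included) on the given strand."""
    template = template.upper()
    fwd = re.compile(to_regex(forward))
    rev = re.compile(to_regex(reverse_complement(reverse)))
    products = []
    for f in fwd.finditer(template):
        # The reverse primer site must lie downstream of the forward site.
        for r in rev.finditer(template, f.end()):
            if r.end() - f.start() <= max_len:
                products.append(template[f.start():r.end()])
    return products


if __name__ == "__main__":
    toy_template = "AAGACCCGTTTTTGCCAGG"  # hypothetical sequence
    print(in_silico_pcr(toy_template, "GAYCCS", "TGRCA"))
```

A fuller check would also search the reverse-complemented template and tolerate a small number of mismatches away from the 3' end, as dedicated tools do.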

1.4- Optimization of the PCR Amplification protocol

1.4.1 – Genomic DNA extraction and quantification

Genomic DNA (gDNA) from Cyanobacteria, Actinobacteria, Planctomycetes, Betaproteobacteria, Gammaproteobacteria and Alphaproteobacteria strains was used to test the efficiency of the designed primers. Only bacteria with already sequenced genomes were selected, except for Cyanobacteria, which were selected based on the presence of PKS and NRPS genes as detailed in a previous study 71. The genomes of the selected bacteria were submitted to antiSMASH analysis to survey the presence of PKS and NRPS genes. A detailed list of the bacterial strains used is given in Supplementary Table S 4. For DNA extraction, bacteria (except Cyanobacteria) were grown in 5 mL of liquid culture medium in 50 mL Falcon tubes, under constant agitation (100 rpm) at 27 °C. ML14 72, Marine Broth (MB) and Luria Broth (LB) media were used for Planctomycetes, Proteobacteria and Chromobacterium violaceum, respectively. Planctomycetes strains were kindly provided by Olga Lage (CIIMAR/FCUP, Porto, Portugal). Genomic DNA from Streptomyces was a kind gift from Marta Vaz Mendes (i3S, Porto, Portugal). The gDNA was then extracted using the E.Z.N.A.® Bacterial DNA Kit (OMEGA bio-tek), following the manufacturer’s instructions, and eluted with elution buffer in a final volume of 100 µL. The integrity of the gDNA was checked by electrophoresis on a 0.8% agarose gel prepared in 1× Tris-acetate-EDTA (TAE) buffer and stained with 1 µL of SYBR® Safe DNA Gel Stain (ThermoFisher Scientific). One microliter of DNA (with loading dye) was loaded onto each lane and the gel was run at 80 V for 30 minutes. For cyanobacterial gDNA extraction, fresh biomass from Z8 73 liquid medium (about 2 mL) was harvested for each selected cyanobacterium. The gDNA was then extracted and purified using the PureLink Genomic DNA Mini Kit (Invitrogen), applying the Gram Negative Bacterial Cell Protocol.
The DNA concentration was measured in a Qubit® 3.0 Fluorometer (Life Technologies) using the Qubit® dsDNA HS Assay Kit (Life Technologies). The manufacturer’s instructions were followed and 1 µL of each gDNA was used. The gDNAs were normalized to a final concentration of 25 ng/µL, unless the initial concentration was lower than this value.

1.4.2 – Optimization of PCR reactions: reagent and thermal conditions

To determine the best PCR conditions, i.e. those giving improved specificity and strong amplification, different conditions and reagents were tested. Three different Taq DNA polymerases were employed: GoTaq (Promega), DreamTaq (Thermo Fisher Scientific) and TaKaRa Taq hot start version (Clontech). As per the manufacturer’s instructions, the basic PCR protocol used with GoTaq polymerase consisted of: 1× Green GoTaq® Flexi Buffer (Promega), 2.5 mM MgCl2 (Promega), 500 μM dNTP mix (Promega), 1 μM of each primer (STABVIDA), 0.5 U of GoTaq® DNA Polymerase (Promega) and 2 μL of template DNA in a 20 μL reaction. The standard PCR conditions were: initial denaturation at 94 °C for 10 min, followed by 40 cycles of denaturation at 94 °C for 30 s, annealing (at the determined temperature) for 30 s and extension at 72 °C for 1 min, and a final extension at 72 °C for 7 min. The basic protocol for DreamTaq consisted of: 1× DreamTaq PCR Master Mix, 1 μM of each primer and 2 μL of template DNA in a final volume of 25 μL. The standard PCR conditions were: initial denaturation at 95 °C for 2 min, followed by 30 cycles of denaturation at 95 °C for 30 s, annealing (at the determined temperature) for 30 s and extension for 1 min, and a final extension at 72 °C for 10 min.
The basic protocol for TaKaRa consisted of: 1× TaKaRa PCR Buffer (TAKARA BIO INC), 1.5 mM MgCl2 (TAKARA BIO INC), 250 μM dNTPs (TAKARA BIO INC), 1 μM of each primer (STABVIDA), 0.5 U of TaKaRa Taq™ Hot Start Version (TAKARA BIO INC) and 2 μL of template DNA in a final volume of 20 μL. The PCR conditions were: initial denaturation at 98 °C for 2 min, followed by 40 cycles of denaturation at 94 °C for 30 s, annealing at the determined temperature for 30 s and extension at 72 °C for 1 min, followed by a final extension at 72 °C for 5 min. Variations to the reaction mixtures mentioned above were carried out and included: gradients of primer, MgCl2 and dNTP concentration, combined with the presence/absence of UltraPure™ BSA (Life Technologies), and a concentration gradient (1-3%) of DMSO (Thermo Scientific). Concerning the thermal cycling protocol, a gradient of annealing temperatures was initially performed for each primer pair to determine the best annealing temperature. Extension time and number of cycles were also optimized. A touchdown PCR protocol 74 was also employed with the TaKaRa Taq hot start polymerase. The protocol consisted of: initial denaturation at 95 °C for 3 min, followed by 10 cycles of denaturation at 95 °C for 30 s, annealing at 75 °C for 45 s and extension at 72 °C for 25 s, followed by 25 cycles of denaturation at 95 °C for 30 s, annealing at 60 °C for 45 s and extension at 72 °C for 25 s, and a final extension at 72 °C for 5 min.
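When several cycling programs are being compared, it can help to write each one down as data rather than prose. The sketch below encodes the two-phase program described above as a list of stages and adds up the programmed hold time; the structure and names (`TOUCHDOWN`, `hold_time_seconds`, `total_cycles`) are hypothetical, and ramping between temperatures is ignored, so the real run on a thermal cycler will be longer.

```python
# Each stage: (number of cycles, [(temperature in degrees C, seconds), ...])
TOUCHDOWN = [
    (1, [(95, 180)]),                       # initial denaturation
    (10, [(95, 30), (75, 45), (72, 25)]),   # first phase, annealing at 75
    (25, [(95, 30), (60, 45), (72, 25)]),   # second phase, annealing at 60
    (1, [(72, 300)]),                       # final extension
]


def hold_time_seconds(program):
    """Sum of programmed hold times, ignoring ramp times."""
    return sum(
        cycles * sum(seconds for _, seconds in steps)
        for cycles, steps in program
    )


def total_cycles(program):
    """Number of amplification cycles (stages with more than one step)."""
    return sum(cycles for cycles, steps in program if len(steps) > 1)


if __name__ == "__main__":
    seconds = hold_time_seconds(TOUCHDOWN)
    print(f"{total_cycles(TOUCHDOWN)} cycles, {seconds / 60:.1f} min of hold time")
```

Keeping programs in this form makes it straightforward to compare, for instance, how a change in extension time or cycle number affects the overall run length.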

1.4.3 – Comparison of amplification results for protocols using designed primers or literature primers

The primer pairs degK2F/degK2R 59 and A3F/A7R 33 were used for PCR amplification of the KS and AD domains, respectively, and were the benchmark against which the designed primers were compared. For the literature primer pairs, the thermal cycling was based on the literature protocols and was performed in a Veriti® 96-Well Thermal Cycler (ThermoFisher Scientific). The PCR reactions were prepared in a volume of 20 μL containing 1× TaKaRa PCR Buffer (TAKARA BIO INC), 1.5 mM MgCl2 (TAKARA BIO INC), 250 μM dNTPs (TAKARA BIO INC), 0.625 μL of primer (100 μM), 0.25 mg/mL of UltraPure™ BSA (Life Technologies), 0.5 U of TaKaRa Taq™ Hot Start Version (TAKARA BIO INC) and 2 μL of template DNA. For amplification of the AD domain using primer pair A3F/A7R, the PCR conditions were: initial denaturation at 95 °C for 4 min, followed by 40 cycles of denaturation at 94 °C for 30 s, annealing at 67.5 °C for 30 s and extension at 72 °C for 60 s, followed by a final extension at 72 °C for 5 min. For primer pair degK2F.i/degK2R.i the conditions were: initial denaturation at 95 °C for 4 min, followed by 40 cycles of denaturation at 94 °C for 40 s, annealing at 56.3 °C for 40 s and extension at 72 °C for 75 s, followed by a final extension at 72 °C for 5 min. PCR products (10 μL loaded onto each well) were separated by electrophoresis on a 1.5% (w/v) agarose gel for 40 minutes at 120 V, together with 5 μL of GRS ladder 1 kb (GRiSP). The gel was stained with 1 μL of SYBR® Safe DNA Gel Stain (ThermoFisher Scientific), visualized under UV light in a Gel Doc XR+ System (BIO-RAD) and analysed with Image Lab™ software (BIO-RAD).
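Reagent amounts in this section are given sometimes as final concentrations and sometimes as volumes of a stock, so a small helper based on the dilution relation C1 × V1 = C2 × V2 can be convenient for converting between the two. The sketch below is illustrative only; the function names are hypothetical, and it assumes that the 0.625 μL of 100 μM stock refers to a single primer in the 20 μL reaction.

```python
def final_concentration(stock_conc, stock_volume, final_volume):
    """C2 = C1 * V1 / V2 (concentration units follow the stock)."""
    return stock_conc * stock_volume / final_volume


def dilution_volumes(sample_conc, target_conc, final_volume):
    """Volumes of sample and diluent needed to reach target_conc.

    Samples already at or below the target are used undiluted,
    mirroring the exception made for low-concentration gDNAs.
    """
    if sample_conc <= target_conc:
        return final_volume, 0.0
    sample = target_conc * final_volume / sample_conc
    return sample, final_volume - sample


if __name__ == "__main__":
    # 0.625 uL of a 100 uM primer stock in a 20 uL reaction
    print(final_concentration(100, 0.625, 20), "uM")
    # Hypothetical gDNA at 100 ng/uL normalized to 25 ng/uL in 20 uL
    print(dilution_volumes(100, 25, 20))
```

Under the stated assumption, the primer volume above corresponds to a final concentration of roughly 3.1 μM.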

1.5 Sequencing of PCR products – Test and phylogenetic analysis

After the PCR protocol optimization, an initial test with the following gDNAs was carried out to evaluate primer functionality. gDNAs of Nodosilinea nodulosa LEGE 06152, Cobetia marina CECT 4278, Halomonas aquamarina CECT 5000 and Streptomyces natalensis ATCC 27448 were used for KS domain amplification. The AD domain was amplified from gDNA of Nodularia sp. LEGE 06071 and Streptomyces natalensis ATCC 27448. The thermal cycling was performed in a Veriti® 96-Well Thermal Cycler (ThermoFisher Scientific). For amplification of the KS domain, the PCR reaction was prepared in a volume of 20 μL containing 1× TaKaRa PCR Buffer (TAKARA BIO INC), 1.5 mM MgCl2 (TAKARA BIO INC), 250 μM dNTPs (TAKARA BIO INC), 1 μM of each primer (KSF1/KS_v2_Rv - STABVIDA), 3% DMSO, 0.5 U of TaKaRa Taq™ Hot Start Version (TAKARA BIO INC) and 2 μL of template DNA. The PCR conditions were: initial denaturation at 98 °C for 2 min, followed by 40 cycles of denaturation at 94 °C for 30 s, annealing at 55 °C for 30 s and extension at 72 °C for 22 s, followed by a final extension at 72 °C for 5 min. For amplification of the AD domain, the reaction was prepared in a volume of 20 μL containing 1× TaKaRa PCR Buffer (TAKARA BIO INC), 1.5 mM MgCl2 (TAKARA BIO INC), 250 μM dNTPs (TAKARA BIO INC), 1 μM of each primer, 0.5 U of TaKaRa Taq™ Hot Start Version (TAKARA BIO INC) and 2 μL of template DNA. The PCR conditions were: initial denaturation at 98 °C for 2 min, followed by 40 cycles of denaturation at 94 °C for 30 s, annealing at 54 °C for 30 s and extension at 72 °C for 22 s, followed by a final extension at 72 °C for 5 min. For validation of the PCR reaction, 5 μL of PCR product were separated by electrophoresis on a 1.5% (w/v) agarose gel for 40 min at 120 V. GRS ladder 100 bp (GRiSP) was used (5 μL loaded). The gel was stained with 1 μL of SYBR® Safe DNA Gel Stain (ThermoFisher Scientific), visualized under UV light in a Gel Doc XR+ System and analysed with Image Lab™ software. After validation of the reaction, PCR products (15 µL loaded onto each well) were separated by electrophoresis on a 1% (w/v) agarose gel for 60 min at 150 V, stained with 1 μL of SYBR® Safe DNA Gel Stain (ThermoFisher Scientific). The bands were visualized under UV light with the Gel Doc XR+ System and excised using sterile scalpels. The bands were purified using the NZYGelpure kit (nzytech) and sequenced by Sanger sequencing at STABVIDA (Portugal). Briefly, the following components were used: BigDye® Terminator v3.1 Cycle Sequencing Kit [Applied Biosystems]; BigDye® Terminator v1.1, v3.1 5× Sequencing Buffer [Applied Biosystems]; 10 μM primer; nuclease-free water (Ambion); purified PCR product.
The sequencing products were purified with illustra™ Sephadex™ G-50 Fine DNA Grade and submitted to automated capillary electrophoresis on an ABI 3730xl Genetic Analyzer (Applied Biosystems). Visual quality control of the electropherograms was performed in Sequence Scanner v1.0 (Applied Biosystems). Raw forward and reverse sequences (ab1 files) were submitted to Geneious 8.1.9 66 for de novo assembly. The resulting consensus sequences (average length of 300 and 240 bp for the KS and AD domains, respectively) were submitted to NCBI for a blastn search against the nucleotide collection database.

1.5.1 – Phylogenetic and NaPDoS analysis

The obtained sequences were aligned with the sequences used for primer design and submitted to a phylogenetic analysis, to survey the diversity covered. KS domain sequences were also classified using the web tool NaPDoS and a phylogenetic tree (using the NaPDoS database as reference) was constructed.

IV – Results

Alignments composed of 63 and 84 sequences with 1462 and 1506 bp for the KS and AD domains, respectively, were obtained as a basis for primer design. For each alignment, a phylogenetic analysis was performed to inspect the diversity covered by the selected sequences. For the KS domain, the phylogenetic tree (Figure 1) shows that the sequences are diverse and encompass all the documented classes of KSs, with a clear clustering pattern linked to function. Likewise, the phylogeny of the AD domain (Figure 2) included the known diversity of ADs and the clustering pattern was congruent with the type of domain. A series of conserved zones were selected for primer design. In total, four forward and two reverse primers were designed for the KS domain, in different regions of the gene (Table S 3). For the AD domain, five forward and four reverse primers were designed. Initially, for the KS domain, primer pairs KSF1/KSR1 and KSF2/KSF1 were designed, which, according to the in silico analysis, seemed very robust. For the AD domain, primers ADFw1/ADRv1, ADFw2 and AFw2.1/ADRv2 and ADFw3/ADRv3 were initially designed. However, PCR amplification produced several non-specific bands in both cases, or no bands at all, even after attempts to optimize the conditions. A second iteration of primer design was performed, either by designing in new regions or by decreasing the degeneracy of the primers, which yielded the primers designated as v2 (Table S 3). Several combinations of primer pairs were tested, and the pairs KSF1/KS_v2_Rv and NRPS_v2_Fw/ADRv3 proved to be the most reliable. When comparing PCR amplifications using these designed primers, albeit with a non-optimized protocol, with those using the currently used literature primer pairs, it is possible to observe a band of the expected size in almost all the strains tested (Figure 3 and Figure 4), but the literature primers fared better. Amplification of the AD domain gave a good indication that the designed primers are able to recover a broader range of diversity, as a product of the expected size was obtained from cyanobacterial gDNA with the designed primers and not with primer pair A3F/A7R (Figure 3). Amplification with gDNA of Proteobacteria yielded similar results (Supplementary Figure S 1).


[Figure 1: phylogenetic tree of KS domain sequences, with clades labelled by class (Hybrid, Trans-AT, KS1, Iterative, Modular, PUFA, Enediyne) and FAS sequences as outgroup; scale bar 0.1]

Figure 1 – Phylogenetic tree of the KS domain sequences collected for primer design and the respective class to which they belong. The tree was computed in MEGA7 110, reconstructed using the Neighbor-Joining method 182 with bootstrap (1000 replications) and comprised 67 nucleotide sequences with 1462 bp. Fatty acid synthase (FAS) sequences from E. coli and Streptomyces sp. were included as outgroup.


[Figure 2: phylogenetic tree of AD domain sequences, with mtaD, nosB and blmVIII as outgroup]

Figure 2 - Phylogenetic tree of the AD domain sequences collected for primer design. The tree was computed in MEGA7 [63], reconstructed using the Neighbor-Joining [135] and bootstrap method (1000 replications) and englobed 82 nucleotide sequences with 1506 bp. Sequences from hybrid PKS-NRPS genes, mtaD, nosB and blmVIII were included as outgroup. FCUP 21 Exploring Polar Microbiomes as Source of Bioactive Molecules

Following these promising results, attempts were made to optimize the specificity (i.e. to decrease the number of non-specific bands) and to improve the amplification (i.e. to obtain strong bands from the largest possible number of strains). The optimized conditions for PCR amplification of the KS domain were: 1× TaKaRa PCR Buffer (TAKARA BIO INC), 1.5 mM MgCl2 (TAKARA BIO INC), 250 μM dNTPs (TAKARA BIO INC), 1 μM of each primer (KSF1/KS_v2_Rv - STABVIDA), 3% DMSO, 0.5 U TaKaRa Taq™ Hot Start Version (TAKARA BIO INC) and 2 μL of template DNA in a final volume of 20 μL. The optimized thermal cycling program consisted of an initial denaturation step at 98 ºC for 2 min, followed by 40 cycles of denaturation at 94 ºC for 30 s, annealing at 55 ºC for 30 s and extension at 72 ºC for 22 s, and a final extension step at 72 ºC for 5 min. For the AD domain, the optimized conditions were: 1× TaKaRa PCR Buffer (TAKARA BIO INC), 1.5 mM MgCl2 (TAKARA BIO INC), 250 μM dNTPs (TAKARA BIO INC), 1 μM of each primer (NRPS_v2_Fw/ADRv3 - STABVIDA), 0.5 U TaKaRa Taq™ Hot Start Version (TAKARA BIO INC) and 2 μL of template DNA. The optimized program consisted of an initial denaturation step at 98 ºC for 2 min, followed by 40 cycles of denaturation at 94 ºC for 30 s, annealing at 54 ºC for 30 s and extension at 72 ºC for 22 s, and a final extension step at 72 ºC for 5 min.
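The final concentrations listed for the KS reaction can be converted into pipetting volumes with the dilution relation C1·V1 = C2·V2. The following is only a minimal sketch: the stock concentrations (10× buffer, 25 mM MgCl2, 2.5 mM dNTPs, 10 μM primers, 100% DMSO, 5 U/μL Taq) are assumptions chosen for illustration and are not stated in the text, and the function and variable names are hypothetical.

```python
# Final concentrations follow the optimized KS reaction described above.
# All STOCK concentrations are illustrative assumptions, not values from the text.
KS_REAGENTS = {
    # name: (assumed stock concentration, final concentration), same unit per reagent
    "PCR buffer (x)": (10.0, 1.0),
    "MgCl2 (mM)": (25.0, 1.5),
    "dNTPs (uM)": (2500.0, 250.0),
    "forward primer (uM)": (10.0, 1.0),
    "reverse primer (uM)": (10.0, 1.0),
    "DMSO (%)": (100.0, 3.0),
}


def reaction_volumes(final_volume_ul, template_ul, reagents, taq_units, taq_stock_u_per_ul):
    """Return the volume (in uL) of each component for one reaction."""
    # C1 * V1 = C2 * V2  ->  V1 = C2 * V2 / C1
    volumes = {
        name: final * final_volume_ul / stock
        for name, (stock, final) in reagents.items()
    }
    volumes["Taq polymerase"] = taq_units / taq_stock_u_per_ul
    volumes["template DNA"] = template_ul
    used = sum(volumes.values())
    if used > final_volume_ul:
        raise ValueError("components exceed the final reaction volume")
    # Water makes up the remaining volume.
    volumes["water"] = final_volume_ul - used
    return volumes


KS_MIX = reaction_volumes(20.0, 2.0, KS_REAGENTS, taq_units=0.5, taq_stock_u_per_ul=5.0)

for component, volume in KS_MIX.items():
    print(f"{component}: {volume:.2f} uL")
```

With different stocks, only the first element of each tuple needs to change; the AD reaction can be described the same way by dropping the DMSO entry.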

Figure 3 - PCR amplification of KS and AD domains from cyanobacterial gDNA. (A) Amplification of the KS domain using primer pair degK2F/degK2R (expected 700 bp) and (B) using the designed primer pair KSF1/KS_v2_Rv (expected 350 bp). (C) PCR amplification of the AD domain using primer pair A3F/A7R (expected 700 bp) and (D) the designed primer pair NRPS_v2_Fw/ADRv3 (expected 240 bp). Legend: 1 – LEGE 91339 gDNA, 2 – LEGE 07179 gDNA, 3 – LEGE 06152 gDNA, 4 – LEGE 06071 gDNA, C+ – gDNA from Streptomyces avermitilis MA-4680 (positive control), C- – negative control, M – GRS ladder 1 kb and 100 bp (Grisp).

Figure 4 - PCR amplification of KS and AD domains from gDNA of Planctomycetes and Streptomyces strains. (A) Amplification of the KS domain using primer pair degK2F/degK2R (expected 700 bp) and (B) using the designed primer pair KSF1/KS_v2_Rv (expected 350 bp). (C) PCR amplification of the AD domain using primer pair A3F/A7R (expected 350 bp) and (D) the designed primer pair NRPS_v2_Fw/ADRv3 (expected 240 bp). Legend: 1 – gDNA from Nodularia sp. LEGE 06071, 2 – gDNA from Planctomycetes strain FC18, 3 – gDNA from Roseimaritima ulvae strain UC8, 4 – gDNA from S. griseus subsp. griseus, 5 – gDNA from Streptomyces avermitilis MA-4680, 6 – gDNA from S. natalensis ATCC 27448, M – GRS ladder 100 bp and 1 kb (Grisp).

Under these optimized conditions, both domains were amplified by PCR and sequenced by Sanger sequencing. The KS domain was amplified using gDNA from Nodosilinea nodulosa LEGE 06152, Cobetia marina CECT 4278, Halomonas aquamarina CECT 5000 and Streptomyces natalensis ATCC 27448. The AD domain was amplified from gDNA of Nodularia sp. LEGE 06071 and Streptomyces natalensis ATCC 27448. In Figure 5, an amplified product of the expected length can be observed for all the tested strains. However, with Streptomyces natalensis ATCC 27448 gDNA (lanes 4 and 6), two bands lying very close together in the expected region are present, and these could not be properly separated during band excision. PCR amplification of the KS domain from gDNA of the cyanobacterial strain Nodosilinea nodulosa LEGE 06152 produced a sequence with 351 bp and 93% quality. The best blast(n) match in the NCBI nucleotide collection database had 74% identity to a PKS sequence from the genome of the cyanobacterium Halomicronema hongdechloris C2206 (GenBank: CP021983.1). This species belongs to the same family (Leptolyngbyaceae) as the tested strain. The obtained sequence was also submitted to an analysis in NaPDoS, which revealed 55% identity to the KS1 domain involved in tylosin biosynthesis [75].

Figure 5 – Electrophoresis gel of the PCR products of KS (expected 350 bp) and AD (expected 240 bp) domain amplification using the optimized conditions. Legend: M – 100 bp ladder (Grisp), 1 – PCR product of PKS gene amplification from gDNA of LEGE 06152, 2 – PCR product of PKS gene amplification from gDNA of Cobetia marina, 3 – PCR product of PKS gene amplification from gDNA of Halomonas aquamarina, 4 – PCR product of PKS gene amplification from gDNA of Streptomyces natalensis, C- – negative control of PKS reaction, 5 – PCR product of NRPS gene amplification from gDNA of Nodularia sp. LEGE 06071, 6 – PCR product of NRPS gene amplification from gDNA of Streptomyces natalensis, C- – negative control of NRPS reaction.

Amplification of the KS domain from C. marina gDNA yielded a sequence of 340 bp with 92% quality. The best blast(n) match in the NCBI nucleotide collection database had 99% identity to the genome of Cobetia marina strain JCM 21022, as expected. The corresponding region of that genome is annotated as a hypothetical protein. According to the NaPDoS result, the sequence had 42% identity to the modular KS domain involved in the biosynthesis of rifamycin. Lane 3 (Figure 5) was expected to correspond to the amplification of the KS domain from gDNA of Halomonas aquamarina; however, a sequence from C. marina was obtained, probably due to a technical mistake. Finally, for the amplification of the KS domain from gDNA of Streptomyces natalensis, only the forward sequence was properly sequenced, yielding 236 bp with 61% quality. The best blast(n) match had 85% identity to the pimaricin biosynthetic gene cluster from the tested strain. The NaPDoS analysis revealed 77% identity to the modular domain involved in the production of nystatin, although with a coverage of only 26%. Concerning the AD domain, amplification from Nodularia sp. LEGE 06071 yielded (after assembly) a sequence of 266 bp with 89% quality. The best blast(n) match at NCBI had 90% query coverage and 86% identity to an NRPS gene from Cyanobacterium sp. LLi5 Clone 5.5 (FJ603047.1). The second match was to the genome of Nostoc sp. NIES-4103 (AP018288.1).
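The identity and query coverage values reported above come from blast(n). As a simplified illustration of what the two figures measure, the sketch below computes them from a single gapped pairwise alignment. It is not BLAST's implementation, the function names are hypothetical, and the toy sequences are invented for the example.

```python
def percent_identity(aligned_query: str, aligned_subject: str) -> float:
    """Identical columns divided by alignment length (gap columns count as mismatches)."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_query:
        raise ValueError("alignment is empty")
    matches = sum(
        q == s and q != "-"
        for q, s in zip(aligned_query.upper(), aligned_subject.upper())
    )
    return 100.0 * matches / len(aligned_query)


def query_coverage(aligned_query: str, query_length: int) -> float:
    """Share of the full query sequence that takes part in the alignment."""
    aligned_bases = sum(base != "-" for base in aligned_query)
    return 100.0 * aligned_bases / query_length


# Toy alignment (invented): 7 query bases aligned out of a 10-base query.
query_row = "ACGT-ACG"
subject_row = "ACGTTACC"
print(percent_identity(query_row, subject_row))  # 75.0
print(query_coverage(query_row, 10))             # 70.0
```

A high identity combined with a low coverage, as in the nystatin match above, means that only a short stretch of the query supports the similarity.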

Figure 6 – Phylogenetic tree encompassing the KS domain sequences used for primer design and the sequences obtained from Nodosilinea nodulosa, Cobetia marina and Streptomyces natalensis gDNA. The tree was computed in MEGA7 [110], reconstructed using the Neighbour-Joining method [182] with bootstrap analysis (1000 replications), and encompassed 67 nucleotide sequences with 257 bp. Fatty acid synthase (FAS) sequences from E. coli and Streptomyces sp. were included as outgroup.

For AD domain amplification from S. natalensis gDNA, both sequences were of poor quality, probably due to the presence of a second band, and a low-quality sequence of 274 bp was obtained. The best blast(n) match at NCBI had 85% query coverage and 73% identity to the genome of Streptomyces gilvosporeus strain F607 (CP020569.1). The first hit was annotated as a hypothetical protein, but inspection of the protein ID showed that it is defined as an NRPS. The obtained sequences were aligned with the sequences retrieved for primer design and a phylogenetic tree was constructed. According to Figure 6, the obtained KS sequences belong to different KS classes, as they group in different clusters of the phylogenetic tree, which indicates that the primers are able to retrieve sequences from a diverse range of KS classes. The sequence from Nodosilinea nodulosa groups with gulB and dszA, which belong to the KS1 class. The Streptomyces sequence grouped within the modular class, while the KS sequence retrieved from Cobetia marina appears to cluster with sequences belonging to the KS1 and PUFA classes, although this grouping is not well supported statistically. The KS sequences were also submitted to a phylogenetic analysis against the NaPDoS database. The results (not shown) were congruent with those obtained here, and the C. marina sequence clustered within the PUFA clade. NaPDoS only accepts the C domain of NRPS genes, and thus it was not possible to perform this analysis with the AD domain.


V - Discussion

The new primer pairs designed in this study are able to retrieve biosynthetic genetic information from a wide range of bacterial phyla, including under-explored groups such as the Verrucomicrobia and Elusimicrobia phyla [3]. The primers appear able to efficiently retrieve information from bacterial strains of Cyanobacteria, Actinobacteria, Planctomycetes, and Alpha-, Beta- and Gammaproteobacteria. Compared with the currently used primer pairs, an improvement is noticeable, especially for the AD domain, with amplification from a wider range of strains. Primer pair A3F/A7R was designed to recognize the conserved A3 and A7 regions in NRPS AD domains and to be specific for actinomycetes [33]. Also, according to the phylogenetic analysis, the new primers recover sequences from diverse, phylogenetically segregated KS classes. These primers will likely be useful for screening programs covering a wide range of taxonomic groups, for example to select promising strains, but more importantly to survey the true biosynthetic potential of microbiomes for the phyla described and to perform metagenome-guided bioprospection, for example to understand which biomes harbour rich biosynthetic diversity. It was demonstrated that the complete amplicon can be directly sequenced with sufficient quality to validate it. Compared with the primers currently in use, the amplicon size (250-300 bp) is advantageous, as it can be completely sequenced by Illumina sequencing technology. However, it is still unclear whether our designed primers perform better than the already published ones, namely in terms of their ability to retrieve a higher diversity of KS or AD sequences. To test this, future work will include screening of mock communities and eDNA by amplicon metagenomics using both the newly designed and the literature primers, followed by diversity analysis.
Overall, in this chapter, we report our efforts to overcome the current limitations of the PCR-based bioprospection strategy through the design of improved primer pairs. While it is too early to affirm that our primers, in contrast to those already published, can efficiently retrieve biosynthetic genetic information from the most abundant bacterial phyla, our results indicate that this is a strong possibility. Such an achievement would revitalize the PCR-based bioprospection strategy, as higher biosynthetic diversity, even from overlooked phyla, would be retrieved, which would invariably lead to the discovery of novel hotspots of biosynthetic diversity and novel molecules. Therefore, our new primers may indeed become a new standard for PCR-based screening and metagenomic bioprospection.


Chapter 2 – Biodiversity and bioactive potential of the McMurdo Dry Valleys, Antarctica


I – Background

NPs are still the major contributors to the development of new therapies and, together with NP derivatives, account for over 50% [76] of currently used drugs. Microbial NPs, mostly secondary metabolites, are produced by microorganisms for a variety of purposes related to protection and survival strategies (e.g. antipredatory, antibiotic, photoprotective) [77]. Some microbial taxa are particularly prolific: among Bacteria, the phyla Actinobacteria, which accounts for two-thirds of currently administered naturally derived antibiotics [78], and Cyanobacteria [79] are well-known examples of chemically rich, bioactive microorganisms. Eukaryotic microorganisms such as Fungi from the Ascomycota phylum [80] are also privileged sources of chemodiversity. A decreasing rate of discovery and a high incidence of rediscovery of known molecules [25], in part due to the repetitive isolation of known strains and the inability to bring new strains into culture, have emerged during the last few decades and have pushed efforts to revitalize bioprospection strategies. The development of new culture methods to improve culturability (e.g. pre-treatment strategies [81]) as well as the exploration of untapped ecosystems have been pointed out as valid strategies to deal with such problems [82]. The prospection of less exploited ecosystems, such as extreme environments like the deep sea [83] and hot and cold deserts, has revealed an unexpected diversity of (novel) microorganisms and metabolites [54]. In polar habitats, including Antarctica, exposure to extremely low temperatures, low nutrient availability and high UV radiation implies the development of biochemical and physiological strategies for survival, including the production of novel chemical entities [54]. Despite the various reports concerning Antarctic microorganisms, these have primarily focused on biodiversity and phylogenetic analysis, and only recently on screening for the potential to produce bioactive compounds [84,85].
Recent explorations and bioassay-guided isolation of compounds from Antarctic microorganisms have yielded an array of interesting new metabolites (Table 5), including the antibacterial terpenoid 4, obtained from the Antarctic cyanobacterium Nostoc CCC 537 and active against three multi-resistant strains of E. coli [86].


Table 5 - Examples of new bioactive molecules retrieved from Antarctic microorganisms. The respective structures are depicted in Figure 7.

Phylum | Compound(s) | Microorganism | Isolation source | Bioactivity | Reference
Actinobacteria | Frigocyclinone (1) | Streptomyces griseus strain NTK 97 | | Antibacterial | 87
Actinobacteria | Gephyromycin (2) | Streptomyces griseus strain NTK 14 | Soil sample | Neuroprotective | 88
Actinobacteria | Microbiaeratin (3) | Microbispora aerata strain IMBAS-11A | Penguin excrement | Antiproliferative and cytotoxic | 89
Cyanobacteria | (4) | Nostoc CCC 537 | | Antibacterial | 86
Fungi | Geomycins B–C (5–6) | Geomyces sp. | Soil sample | Antibacterial | 90
Fungi | Chetracin C-D (7-8) | Oidiodendron truncatum GW3-13 | Soil sample | Cytotoxic | 91
Fungi | Penilactone A-B (9-10) | Penicillium crustosum PRB-2 | Deep-sea sediment | NF-κB inhibitory | 92

Figure 7 – Structures of the molecules produced by Antarctic microorganisms presented in Table 5.

II – Goals

With the overall goal of bioprospecting the Victoria Valley system in Antarctica, the largest of the McMurdo Dry Valleys, with an ice-free area of approximately 650 km², for bioactive small molecules, and using environmental samples from a rock with endolithic colonization (i.e. within the pore space of rocks [93]) and from a soil transect with decreasing water availability, we focused on the following specific objectives: 1) to guide the isolation of specific chemically prolific taxa with the aid of the taxonomic composition data obtained from 16S rRNA gene analysis; 2) to use different culture strategies, including pre-treatments, to maximize the recovery of biodiversity, including novel strains, from the Antarctic samples; 3) to screen the bioactive potential of the obtained isolates in a panel of diverse, relevant bioassays.


III – Materials and Methods

1–Biodiversity of a Soil Transect and of a Rock with endolithic colonization in Victoria Valley, Victoria Land, Antarctica

1.1 - Sample collection

Substrate from a rock with endolithic colonization (END) and from a soil transect with decreasing water availability (from T1; 77º 20.241’S, 161º 38.593’E to T6; 77º 20.232’S, 161º 38.526’E) was collected in Victoria Valley during the K020 Mission in January 2013, within the scope of the NITROEXTREM project (http://www.cmagalhaes.com/project-publications?item=3095-nitroextrem), integrated in the ICTAR international program (ICTAR–www.ictar.aq). Victoria Valley is located in the McMurdo Dry Valleys region, in South Victoria Land, Antarctica. The samples were preserved at -80 ºC in LifeGuard Solution (MoBio).

Figure 8 – (A) Location of the sampling points in Victoria Valley (marked in red). Coordinates: soil samples T1 (S77 20.241, E161 38.593), T2 (S77 20.240, E161 38.584), T3 (S77 20.238, E161 38.578), T4 (S77 20.237, E161 38.565), T5 (S77 20.235, E161 38.547), T6 (S77 20.232, E161 38.526); and endolithic sample (END) (S77 47.278, E161 73.500). The map was generated using Google Earth version 7.3.0. (B) Landscape in the McMurdo Dry Valleys. (C) Sandstone rock with cryptoendolithic colonization. (D) Soil samples collected in Victoria Valley.

1.2 - eDNA Extraction and 16S rRNA gene sequencing

Environmental DNA (eDNA) had been previously extracted from the Victoria Valley samples by the NITROEXTREM project team using a modification of the CTAB extraction protocol [94]. The 16S rRNA gene was initially amplified by PCR using the universal primer pair 27F/1492R [95] and then sequenced by pyrosequencing technology. Briefly, the V3/V4 hypervariable region of the 16S rRNA gene was amplified with barcoded fusion primers containing the Roche-454 A and B Titanium sequencing adapters, an eight-base barcode sequence, and the forward (5’-ACTCCTACGGGAGGCAG-3’) and reverse (5’-TACNVRRGTHTCTAATYC-3’) primers [96]. The PCR reaction was performed using 5 U of Advantage Taq polymerase (Clontech), 0.2 µM of each primer, 0.2 mM dNTPs, 6% DMSO and 2-3 µL of template DNA. The PCR conditions employed were: an initial denaturation step at 94 ºC for 3 min, followed by 25 cycles of 94 ºC for 30 s, 44 ºC for 45 s and 68 ºC for 60 s, and a final elongation step at 68 ºC for 10 min. The amplicons were quantified by fluorimetry with PicoGreen (Invitrogen), pooled at equimolar concentrations and sequenced in the A direction with GS 454 FLX Titanium chemistry, according to the manufacturer’s instructions (Roche, 454 Life Sciences) at Biocant (Cantanhede, Portugal).
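The reverse primer above contains IUPAC degenerate positions (N, V, R, H, Y), so it stands for a pool of oligonucleotides rather than a single sequence. The sketch below, with hypothetical helper names, counts how many distinct sequences a degenerate primer represents and turns it into a regular expression that can be used to locate the primer in a read. Only the two primer sequences are taken from the text; the test sequences are invented.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

FORWARD = "ACTCCTACGGGAGGCAG"
REVERSE = "TACNVRRGTHTCTAATYC"


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


def primer_regex(primer: str) -> re.Pattern:
    """Compile a regular expression matching every sequence the primer encodes."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else f"[{options}]")
    return re.compile("".join(parts))


print(degeneracy(FORWARD))  # 1
print(degeneracy(REVERSE))  # 288
print(bool(primer_regex(REVERSE).fullmatch("TACAAGGGTATCTAATCC")))  # True
```

The reverse primer has six degenerate positions (4 × 3 × 2 × 2 × 3 × 2 = 288 variants), which is part of what lets it anneal across divergent 16S rRNA genes.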

1.2.1 - QIIME Analysis of the 16S rRNA gene

The 454-machine-generated FASTA (.fna) and quality score (.qual) files were processed using the QIIME (Quantitative Insights Into Microbial Ecology) pipeline. QIIME is an open-source software pipeline for the analysis and comparison of microbial communities, which provides a wide range of microbial community analyses and visualizations [97]. The online tutorial “454 Overview Tutorial” (http://qiime.org/tutorials/tutorial.html) was followed. Initially, a mapping file was created containing the information necessary to perform the analysis, including the specific barcode sequence for each sample. The first step, demultiplexing and quality filtering, was executed with the default split_libraries.py script, in which the multiplexed reads are assigned to the respective samples based on the barcode sequence and low-quality or ambiguous reads are removed. The next step, picking OTUs (Operational Taxonomic Units) [98], was performed in parallel with three different workflows: pick_de_novo_otus.py, pick_otus.py (closed-reference method) and pick_open_reference_otus.py (open-reference method). The OTU table obtained from the open-reference method was selected for the downstream analyses. Essentially, all sequences were clustered into OTUs at 97% sequence similarity using UCLUST [99] and the reads were aligned to the Greengenes v13_8 (GG) [100] database using PyNAST. For the taxonomic assignment, the RDP Classifier 2.2 [101] was used with the UCLUST method.

The taxonomic composition of the communities was summarized at different taxonomic levels using the summarize_taxa_through_plots.py workflow. For each sample, alpha diversity was computed on rarefied OTU tables using the alpha_rarefaction.py workflow. The lowest number of sequences per sample (2910) was used as the rarefaction depth. The Chao1 parameter [102], observed OTUs and phylogenetic diversity metrics were calculated. Beta diversity was calculated with the beta_diversity_through_plots.py workflow using weighted and unweighted UniFrac metrics [103].
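To make the alpha-diversity step concrete, the sketch below rarefies one sample to a fixed depth and computes observed OTUs and Chao1. It is a toy illustration in plain Python, not QIIME's code: the function names and the example counts are hypothetical, and the bias-corrected form of Chao1, S_obs + F1(F1 − 1) / (2(F2 + 1)), is assumed here, where F1 and F2 are the numbers of OTUs seen exactly once and twice.

```python
import random
from collections import Counter


def rarefy(otu_counts: dict, depth: int, seed: int = 0) -> Counter:
    """Randomly subsample `depth` reads without replacement from one sample."""
    pool = [otu for otu, n in otu_counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("sample has fewer reads than the rarefaction depth")
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    return Counter(rng.sample(pool, depth))


def observed_otus(otu_counts: dict) -> int:
    """Number of OTUs with at least one read."""
    return sum(1 for n in otu_counts.values() if n > 0)


def chao1(otu_counts: dict) -> float:
    """Bias-corrected Chao1 richness estimate (assumed variant, see lead-in)."""
    counts = [n for n in otu_counts.values() if n > 0]
    singletons = sum(1 for n in counts if n == 1)
    doubletons = sum(1 for n in counts if n == 2)
    return len(counts) + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Invented counts for a single sample.
sample = {"otu_a": 10, "otu_b": 1, "otu_c": 1, "otu_d": 1, "otu_e": 2, "otu_f": 2}
print(observed_otus(sample))  # 6
print(chao1(sample))          # 7.0
subsample = rarefy(sample, depth=5)
print(sum(subsample.values()))  # 5
```

In the analysis described above, the depth would be 2910, the read count of the smallest sample, so that richness estimates are compared at equal sequencing effort.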

1.3 - Prediction of the microbiome metabolic capacity using PICRUSt

PICRUSt (phylogenetic investigation of communities by reconstruction of unobserved states), a software tool able to computationally extrapolate the functional composition of a microbial community’s metagenome from marker gene data (16S rRNA gene), was applied to the 16S rRNA gene data [104]. The OTU table produced in QIIME with the closed-reference method, using the Greengenes database, was used as input in PICRUSt together with the mapping file. The OTU table was normalized by 16S rRNA gene copy number and used to predict functional traits as KEGG Orthologs (KOs), categorized by function (KEGG pathways).
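The two steps described above (copy-number normalization, then functional prediction) amount to simple arithmetic on the OTU table. The toy sketch below illustrates that arithmetic only; it does not call PICRUSt, and every OTU name, copy number and KO count in it is invented for the example.

```python
def normalize_by_copy_number(otu_table: dict, copy_numbers: dict) -> dict:
    """Divide each OTU's read counts by its 16S rRNA gene copy number."""
    return {
        otu: {sample: count / copy_numbers[otu] for sample, count in per_sample.items()}
        for otu, per_sample in otu_table.items()
    }


def predict_ko_abundance(normalized_table: dict, ko_content: dict) -> dict:
    """Sum, per sample, normalized OTU abundance times KO copies per genome."""
    predicted = {}
    for otu, per_sample in normalized_table.items():
        for sample, abundance in per_sample.items():
            sample_kos = predicted.setdefault(sample, {})
            for ko, copies in ko_content.get(otu, {}).items():
                sample_kos[ko] = sample_kos.get(ko, 0.0) + abundance * copies
    return predicted


# Invented inputs: read counts per OTU and sample, 16S copy numbers, KO copies per genome.
otu_table = {"otu1": {"T1": 10, "T6": 4}, "otu2": {"T1": 6, "T6": 0}}
copy_numbers = {"otu1": 2, "otu2": 3}
ko_content = {"otu1": {"K00001": 1, "K00002": 2}, "otu2": {"K00001": 3}}

normalized = normalize_by_copy_number(otu_table, copy_numbers)
predicted = predict_ko_abundance(normalized, ko_content)
print(predicted["T1"])  # {'K00001': 11.0, 'K00002': 10.0}
```

Without the normalization step, taxa carrying many 16S rRNA gene copies would be over-counted and their functions over-represented in the predicted metagenome.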

2- Isolation of microorganisms from a soil transect and endolithic sample from Victoria Valley

The environmental samples collected from Victoria Valley were preserved at -80 ºC in LifeGuard Solution (MoBio). From these, the samples for bacterial isolation were selected based on their taxonomic composition (16S rRNA data from eDNA) and the amount of sample available. Samples T5 and T6 were selected for the isolation of Actinobacteria. It is known that the cultivable fraction of microbial richness is typically below 1% [45], so in order to improve cultivability and maximise the recovery of microbial biodiversity from the samples, different culture strategies, including pre-treatments, were employed.

2.1 – Culture strategies – Soil samples T5 and T6

For sample T6, 0.5 g of the original sample (soil) was weighed under sterile conditions and 5 mL of sterile saline solution (0.85% NaCl) was added to resuspend the substrate. The suspension was vortexed for 10 min and allowed to settle for 2 min. Serial dilutions (down to a dilution factor of 10⁻²) of the supernatant were performed, inoculated (in duplicate) on solid media and incubated at three different temperatures (4, 9 and 19 ºC). All the media were supplemented with 5 ppm of cycloheximide (BioChemica) and streptomycin (BioChemica). Initially, the dilutions were plated onto an oligotrophic medium, nutrient-poor sediment extract (NPS), prepared first with an extract from the original sample substrate and later with beach sand, to simulate the oligotrophic environmental conditions. Previous works have indicated that soil-extract agar is able to retrieve a wider and more diverse range of microorganisms than traditional media [105]. Briefly, ca. 500 g of substrate was mixed with 500 mL of distilled water, homogenised and allowed to settle. For medium preparation, 100 mL of the supernatant was mixed with 900 mL of distilled water and 17 g of bacteriological agar. The colonies obtained were also streaked onto the following richer media: modified nutrient-poor sediment extract (MNPS; 5 g/L soluble starch, 1 g/L KNO3, 100 mL/L substrate extract and 17 g/L agar), ISP2 [106], and raffinose-histidine (10 g/L raffinose pentahydrate, 1 g/L L-histidine, 1 g/L dipotassium phosphate, 0.5 g/L magnesium sulfate heptahydrate, 0.01 g/L iron(II) sulfate heptahydrate and 16 g/L bacteriological agar). Bacterial colonies were successively streaked until pure cultures were achieved.

In order to select for sporulating organisms (i.e. those able to resist extreme conditions such as high temperatures) and for Gram-positive bacteria (Actinobacteria), sample T5 was submitted to two different pre-treatments: 1) heat shock and 2) incubation with antibiotics [107]. Initially, 0.5 g of sample (soil) was weighed under sterile conditions and 2.5 mL of sterile saline solution (0.85% NaCl) was added to resuspend the substrate. The sample was sonicated for 1 min and vortexed for 5 min. For pre-treatment 1), 1 mL of the suspension was incubated at 50 ºC for 5 min. For pre-treatment 2), 1 mL of the suspension was incubated with 20 ppm of streptomycin (BioChemica) and nalidixic acid (BioChemica) at 28 ºC for 30 min. For each treatment, serial dilutions (down to a dilution factor of 10⁻²) were performed and each was plated onto different selective media for Actinobacteria: AIA (sodium caseinate 2 g/L, L-asparagine 0.1 g/L, sodium propionate 4 g/L, dipotassium phosphate), Czapek agar (adapted from [108]) and SCN (soluble starch 10 g/L, casein sodium salt from bovine milk 0.3 g/L, potassium phosphate dibasic trihydrate 2.62 g/L, potassium nitrate 2 g/L, sodium chloride 2 g/L, magnesium sulfate heptahydrate 0.05 g/L, calcium carbonate 0.02 g/L, iron(II) sulfate heptahydrate 0.01 g/L), and incubated at 4 ºC and 28 ºC. The remaining substrate (prior to the pre-treatments) was also plated under the same conditions. The bacterial colonies found on the plates were streaked onto the same isolation media until pure cultures were obtained.

Biomass from pure isolates was collected in cryopreservation tubes, each with 800 µL of saline solution and 220 µL of glycerol solution. The tubes were stored at -80 ºC at the Ecobiotec Culture Collection (at CIIMAR, University of Porto). Some strains (those growing poorly) were also cryopreserved in CRYOINSTANT Mixed cryotubes (VWR Chemicals).
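The media recipes above are given per litre. A minimal sketch of scaling such a recipe to a smaller batch is shown below, using the raffinose-histidine medium as the example. The 250 mL batch volume and all names in the code are assumptions made for illustration.

```python
# Raffinose-histidine medium, amounts in g/L as listed in the text.
RAFFINOSE_HISTIDINE_G_PER_L = {
    "raffinose pentahydrate": 10.0,
    "L-histidine": 1.0,
    "dipotassium phosphate": 1.0,
    "magnesium sulfate heptahydrate": 0.5,
    "iron(II) sulfate heptahydrate": 0.01,
    "bacteriological agar": 16.0,
}


def scale_recipe(recipe_g_per_l: dict, batch_volume_ml: float) -> dict:
    """Return the mass (g) of each ingredient needed for the given batch volume."""
    if batch_volume_ml <= 0:
        raise ValueError("batch volume must be positive")
    return {
        ingredient: grams_per_litre * batch_volume_ml / 1000.0
        for ingredient, grams_per_litre in recipe_g_per_l.items()
    }


# Hypothetical 250 mL batch.
batch = scale_recipe(RAFFINOSE_HISTIDINE_G_PER_L, 250.0)
for ingredient, grams in batch.items():
    print(f"{ingredient}: {grams:.4g} g")
```

Ingredients required in milligram amounts, such as the iron(II) sulfate here (2.5 mg for 250 mL), are usually easier to add from a concentrated stock solution than to weigh directly.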

2.2 - Identification of Bacterial and Fungal Isolates through 16S rRNA and ITS gene amplification and Phylogenetic analysis

2.2.1 - Identification of bacterial and fungal isolates using FTA Indicating Micro Cards (Whatman™)

For bacterial and fungal strains exhibiting poor growth under the tested conditions, a protocol using Whatman™ FTA™ Indicating Micro Cards was followed to identify the isolates. A single colony was picked from solid culture medium and suspended in 60 µL of TE buffer. The suspension was transferred to the Whatman™ FTA™ card and dried completely at room temperature in the flow chamber. The remaining steps were performed at STABVIDA (Portugal). Briefly, the DNA was purified from a 2.0 mm disc with the aid of FTA Purification Reagent (Whatman), and the 16S rRNA, ITS and D1/D2 regions of rDNA were amplified using the universal primers 27F/800R + 518F/1492R and ITS1/ITS4 + NL1/NL4, respectively, with 2 U of Surf Hot Taq DNA polymerase (STABVIDA) in a SureCycler 8800 (Agilent, USA) thermocycler. The PCR products were purified using the commercial Mag-Bind PCR Clean Up 96 kit (Omega bio-tek). The purified PCR products were sequenced using the BigDye terminator sequencing kit v3.1 (Applied Biosystems) in a 3730XL DNA analyzer (Applied Biosystems). The universal primers 27F, 518F, 800R and 1492R were used. The consensus sequences obtained were submitted to a blast(n) search at NCBI against the 16S rRNA gene sequences (Bacteria and Archaea) database or against the nucleotide collection (ITS and D1/D2 sequences).

2.2.2 – Identification of bacterial isolates through 16S rRNA gene amplification

2.2.2.1 – DNA extraction

After isolation of pure cultures, each bacterial isolate was transferred to 10 mL of liquid medium in 50 mL Falcon tubes and grown until enough biomass was obtained to extract DNA. The DNA was then extracted using the E.Z.N.A.® Bacterial DNA Kit (OMEGA bio-tek). The manufacturer’s instructions were followed and the DNA was eluted in a final volume of 100 µL of elution buffer. The integrity of the gDNA was assessed by agarose gel electrophoresis (0.8% agarose gel prepared in 1× TAE buffer and stained with 1 µL of SYBR® Safe DNA Gel Stain (ThermoFisher Scientific)). One microliter of DNA (with loading dye) was loaded onto each lane before electrophoresis at 80 V for 30 min.

2.2.2.2 – PCR Amplification of the 16S rRNA gene

The 16S rRNA gene was amplified by PCR using primer pair 27F/1492R 109 (1465 bp) in a Veriti® 96-Well Thermal Cycler (ThermoFisher Scientific). The PCR reaction was prepared in a volume of 10 μL containing 1× TaKaRa PCR Buffer (TAKARA BIO INC), 1.5 mM MgCl2 (TAKARA BIO INC), 250 μM dNTPs (TAKARA BIO INC), 1.5 μL of each primer (2 μM), 0.25 mg/mL of UltraPure™ BSA (Life Technologies), 0.25 U TaKaRa Taq™ Hot Start Version (TAKARA BIO INC) and 1 μL of template DNA. The PCR conditions were: an initial denaturation step at 98 ºC for 2 min, followed by 30 cycles of denaturation at 94 ºC for 30 s, annealing at 48 ºC for 90 s and extension at 72 ºC for 2 min, and a final extension step at 72 ºC for 10 min. PCR products (3 μL loaded in each well) were separated by electrophoresis on a 1.5% (w:v) agarose gel for 30 min at 150 V. The ladder used was GRS ladder 1kb (Grisp). The gel was stained with 1 μL SYBR® Safe DNA Gel Stain (ThermoFisher Scientific), visualized under UV light in a Gel Doc XR+ System (BIO-RAD) and analysed with the Image Lab™ software (BIO-RAD). The PCR products were then sequenced by Sanger sequencing at i3S (Porto, Portugal). Briefly, the PCR products were purified with ExoSAP-IT® Express (Affymetrix) and sequenced with the following components: BigDye® terminator v3.1 cycle sequencing kit [Applied Biosystems]; BigDye® terminator v1.1, v3.1 5x sequencing buffer [Applied Biosystems]; primer 10 μM; nuclease-free water (Ambion); purified PCR product. The sequencing products were purified with illustra™ Sephadex™ G-50 Fine DNA grade and subjected to automated capillary electrophoresis on a 3130xl Genetic Analyzer sequencer (Applied Biosystems). Visual quality control of the electropherograms was performed in Sequence Scanner v1.0 (Applied Biosystems). Raw forward and reverse sequences (ab1 files) were imported into Geneious 8.1.9 66 for de novo assembly. The consensus sequences obtained (average length 1200 bp) were submitted to a blast(n) search at NCBI against the 16S rRNA sequences (Bacteria and Archaea) database.
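The primer amount in the 16S rRNA reaction can be checked with the dilution rule C1·V1 = C2·V2. The sketch below assumes that "(2 μM)" refers to the concentration of the primer stock that was pipetted; the function name is illustrative and not part of the original protocol.

```python
def final_concentration(stock_conc: float, added_volume: float, total_volume: float) -> float:
    """Dilution rule C1*V1 = C2*V2 solved for C2 (same units as stock_conc)."""
    if total_volume <= 0:
        raise ValueError("total_volume must be positive")
    return stock_conc * added_volume / total_volume


# Assumption: 1.5 uL of a 2 uM primer stock in a 10 uL reaction.
primer_final_um = final_concentration(stock_conc=2.0, added_volume=1.5, total_volume=10.0)
print(f"Final concentration of each primer: {primer_final_um:.2f} uM")
```

Under that reading, each primer ends up at roughly 0.3 μM in the reaction.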

2.3 – Phylogenetic analysis

The obtained 16S rRNA gene (or ITS) sequences were submitted to a blast(n) analysis against the NCBI 16S rRNA sequences (Bacteria and Archaea) (or Nucleotide collection) database, and the sequences of the first ten blast(n) matches were retrieved. The multiple sequence alignment (using the ClustalW algorithm) and the phylogenetic analysis were performed in MEGA7 110. The alignments were manually curated to remove short sequences and gap regions. The phylogenetic trees were reconstructed using the Maximum Likelihood statistical method, with bootstrap analysis (1000 replications) and the Tamura-Nei 111 substitution model.
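For readers unfamiliar with bootstrap support, the resampling step can be sketched as follows: each pseudo-replicate is built by drawing alignment columns with replacement, and a tree is then inferred from every replicate. This is only an illustration of the general idea, not MEGA7's implementation; the toy sequences and all names are hypothetical.

```python
import random
from typing import Dict


def bootstrap_alignment(alignment: Dict[str, str], rng: random.Random) -> Dict[str, str]:
    """Return one pseudo-replicate: alignment columns resampled with replacement."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    # Draw as many column indices as the alignment has columns.
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}


# Hypothetical toy alignment (not data from this work).
toy = {"isolate_1": "ACGTAC", "ref_A": "ACGTTC", "ref_B": "ACCTTC"}
rng = random.Random(0)
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
print(len(replicates), "pseudo-replicates of", len(toy["isolate_1"]), "columns each")
```

The bootstrap value of a clade is then the percentage of replicate trees in which that clade is recovered.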

3- Screening by PCR of PKS and NRPS genes in bacterial isolates

For the PCR amplification of the KS and AD domains of PKS and NRPS genes, respectively, four sets of primers were used: degK2F.i/degK2R.i 59 and DKF/DKR 34 for PKS, and A3F/A7R 33 and MTF/MTR 35 for NRPS (Table 6). Thermal cycling was performed in a Veriti® 96-Well Thermal Cycler (ThermoFisher Scientific). The PCR reaction was prepared in a volume of 20 μL containing 1× TaKaRa PCR Buffer (TAKARA BIO INC), 1.5 mM MgCl2 (TAKARA BIO INC), 250 μM dNTPs (TAKARA BIO INC), 0.625 μL of primers A3F/A7R and degK2F.i/degK2R.i (100 μM) or 1 μM of primers MTF/MTR and DKF/DKR, 0.25 mg/mL of UltraPure™ BSA (Life Technologies), 0.5 U TaKaRa Taq™ Hot Start Version (TAKARA BIO INC) and 2 μL of template DNA. For amplification of the AD domain with primer pair A3F/A7R, the PCR conditions were: an initial denaturation step at 95 ºC for 4 min, followed by 40 cycles of denaturation at 94 ºC for 30 s, annealing at 67.5 ºC for 30 s and extension at 72 ºC for 60 s, and a final extension step at 72 ºC for 5 min. For primer pair degK2F.i/degK2R.i, the conditions were: an initial denaturation step at 95 ºC for 4 min, followed by 40 cycles of denaturation at 94 ºC for 40 s, annealing at 56.3 ºC for 40 s and extension at 72 ºC for 75 s, and a final extension step at 72 ºC for 5 min. For primer pair DKF/DKR, the initial denaturation step at 95 ºC for 4 min was followed by 30 cycles of denaturation at 94 ºC for 30 s, annealing at 55 ºC for 30 s and extension at 72 ºC for 60 s, and a final extension step at 72 ºC for 5 min. For primer pair MTF/MTR, an initial denaturation step at 95 ºC for 4 min was followed by 35 cycles of denaturation at 94 ºC for 10 s, annealing at 52 ºC for 20 s and extension at 72 ºC for 1 min, and a final extension step at 72 ºC for 7 min. PCR products (10 μL loaded in each well) were separated by electrophoresis on a 1.5% (w:v) agarose gel for 40 min at 120 V, together with 5 μL of GRS ladder 1kb (Grisp). The gel was stained with 1 μL SYBR® Safe DNA Gel Stain (ThermoFisher Scientific), visualized under UV light in a Gel Doc XR+ System (BIO-RAD) and analysed with the Image Lab™ software (BIO-RAD).
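The four cycling programs described above can be written down as data, which makes them easy to compare. The sketch below only adds up the programmed hold times given in the text; ramping between temperatures is ignored, so real run times on the thermal cycler would be longer. The function and dictionary names are illustrative.

```python
def hold_time_min(initial_s, cycles, step_seconds, final_s):
    """Total programmed hold time in minutes (temperature ramping is ignored)."""
    return (initial_s + cycles * sum(step_seconds) + final_s) / 60


# (initial denaturation s, cycles, (denaturation, annealing, extension) s, final extension s)
PROGRAMS = {
    "A3F/A7R": (240, 40, (30, 30, 60), 300),
    "degK2F.i/degK2R.i": (240, 40, (40, 40, 75), 300),
    "DKF/DKR": (240, 30, (30, 30, 60), 300),
    "MTF/MTR": (240, 35, (10, 20, 60), 420),
}

for pair, program in PROGRAMS.items():
    print(f"{pair}: {hold_time_min(*program):.1f} min of programmed holds")
```

Keeping the programs in one table-like structure also makes differences between primer pairs (cycle number, annealing and extension times) easier to spot than in running text.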


Table 6 – List of primers used in this work.

Primer name Sequence (5’-3’) Reference

27F GAGTTTGATCCTGGCTCAG 109

1492R TACGGYTACCTTGTTACGACTT 109

518F ATTACCGCGGCTGCTGG 112

ITS1 TCCGTAGGTGAACCTGCGG 113

ITS4 TCCTCCGCTTATTGATATGC 113

NL1 GCATATCAATAAGCGGAGGAAAAG 114

NL4 GGTCCGTGTTTCAAGACGG 114

MTF GCNGG(C/T)GG(C/T)GCNTA(C/T)GTNCC 35

MTR CCNCG(AGT)AT(TC)TTNAC(T/C)TG 35

DKF GTGCCGGTNCC(AG)TG(GATC)G(TC)(TC)TC 34

DKR GCGATGGA(TC)CCNCA(AG)CA(AG)(CA)G 34

A3F GCSTACSYSATSTACACSTCSGG 33

A7R SASGTCVCCSGTSCGGTAS 33

degK2F.i GCIATGGAYCCICARCARMGIVT 59

degK2R.i GTICCIGTICCRTGISCYTCIAC 59

4 - Preparation of organic extracts for Bioactivity-Guided Isolation of Bioactive Molecules

4.1 – Organic extraction for Bioactivity Screenings

Pure cultures isolated on solid medium were inoculated into 100 mL of liquid medium in 250 mL Erlenmeyer flasks and incubated at 26 ºC with constant agitation (100 rpm). Once the exponential phase was reached, 1.5 g of Amberlite® XAD16N resin, 20-60 mesh (SIGMA-ALDRICH), was added to the culture. After one week of growth, the biomass and resin were centrifuged at 4500 g for 10 min, washed twice with dH2O and freeze-dried. The freeze-dried biomass and resin were then extracted with a 1:1 mixture of acetone and methanol. The biomass was initially immersed in the mixture (100 mL) for 30 min with constant agitation and then centrifuged (4500 g for 10 min), and the liquid phase was collected in a round-bottom (RB) flask after passage through a Whatman No. 1 filter paper. The process was repeated twice, and the extract was dried in a rotary evaporator, transferred to a vial and dried in vacuo.

4.2 - Organic extraction (methanol) and fractionation (VLC) from Penicillium citrinum strain 31

Penicillium citrinum strain 31, isolated from the environmental sample T6, was selected for large-scale cultivation because of its fast growth and because this genus is a well-known producer of secondary metabolites 115. Large-scale cultures were grown in ISP2 liquid medium, at 26 ºC with constant agitation (100 rpm), for 20 days in 1 L flasks. The biomass was then harvested, freeze-dried and kept at -20 ºC. After 80 L of cultivation, 22 g of freeze-dried biomass were obtained for organic extraction and fractionation.

4.2.1 – Organic extraction

The extraction apparatus was prepared by assembling a Büchner funnel with a Whatman No. 1 filter paper and cheesecloth, a vacuum adapter and a RB flask. The freeze-dried fungal biomass (21.9 g, d.w.) was first macerated with a spatula and immersed, in a beaker, in enough methanol to cover it completely, for 20 min with occasional stirring. The solvent was then poured into the Büchner funnel and the extract was collected in the RB flask. Before each new iteration of the extraction, the biomass retained in the cheesecloth was returned to the beaker. This process was repeated seven times with methanol, each time for 20 min. The last four iterations were carried out at 40 ºC on a hotplate, with constant stirring. After extraction, the RB flask was removed from the assembly and the solvent was removed in a rotary evaporator. The content of the RB flask was then dissolved in methanol and filtered through cotton wool in a funnel to retain any remaining cells. The filtered extract was transferred to three pre-weighed 20 mL glass vials and dried under vacuum to yield 4.41 g of crude extract.
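The masses reported above allow a quick estimate of the extraction yield and of the biomass productivity of the large-scale culture. The short calculation below uses only figures stated in the text (4.41 g of crude extract from 21.9 g of dry biomass; 22 g of dry biomass from 80 L of culture); the helper name is illustrative.

```python
def percent_yield(product_g: float, starting_g: float) -> float:
    """Mass yield in % (w/w) of product relative to starting material."""
    if starting_g <= 0:
        raise ValueError("starting_g must be positive")
    return 100 * product_g / starting_g


extract_yield = percent_yield(product_g=4.41, starting_g=21.9)  # crude extract vs dry biomass
biomass_g_per_l = 22 / 80                                       # dry biomass per litre of culture

print(f"Crude extract yield: {extract_yield:.1f}% (w/w)")
print(f"Dry biomass productivity: {biomass_g_per_l:.3f} g/L")
```

This corresponds to a crude extract yield of about 20% of the dry biomass.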

4.2.2 – Fractionation of the organic extract

The crude extract (4.41 g) was fractionated by normal-phase (silica gel 60, 0.0015-0.040 mm, Merck) vacuum liquid chromatography (VLC), using a solvent gradient from non-polar hexanes to ethyl acetate to methanol (as described in Table 7). The fractionation apparatus was assembled according to Figure 9 (A), using a Synthware Chromatography Column. The crude extract was applied to the column by dry loading: the extract was dissolved in methanol and transferred to a RB flask, silica was added and the mixture was dried in a rotary evaporator. The dried extract adsorbed to the silica was then loaded onto the top of the silica column. The different solvent mixtures were added sequentially to the column (as described in Table 7), without letting the silica surface become exposed to air. Twelve fractions (A - L) were obtained, in order of increasing polarity, collected in RB flasks and dried in a rotary evaporator. The fractions were then resuspended in dichloromethane and ethyl acetate (fractions A-H) or methanol (fractions I-L), filtered through cotton wool (Figure 9B), transferred to vials, dried in vacuo and stored at -80 ºC until further use.

Table 7 - Solvent mixtures (eluents) utilized in the fractionation of the crude extract from Penicillium citrinum strain 31, namely ethyl acetate (EtOAc), hexane (hex) and methanol (MeOH).

Fraction   Solvent mixture      Volume
A          50% EtOAc (hex)      300 mL
B          70% EtOAc (hex)      300 mL
C          90% EtOAc (hex)      300 mL
D          100% EtOAc           300 mL
E          5% MeOH (EtOAc)      300 mL
F          10% MeOH (EtOAc)     300 mL
G          20% MeOH (EtOAc)     300 mL
H          20% MeOH (EtOAc)     300 mL
I          50% MeOH (EtOAc)     300 mL
J          70% MeOH (EtOAc)     300 mL
K          90% MeOH (EtOAc)     300 mL
L          100% MeOH            300 mL (4x)

Figure 9 – (A) Fractionation apparatus. (B) Filtration of a fraction through cotton.
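The eluent compositions in Table 7 translate directly into solvent volumes to be measured. The sketch below assumes that each percentage is a volume fraction (v/v) of the more polar solvent and that every step is a single 300 mL portion; the "(4x)" note attached to the last step is not taken into account. All names are illustrative.

```python
from collections import defaultdict


def mixture_volumes(percent_polar: float, total_ml: float):
    """Split one elution step into (more polar, less polar) solvent volumes in mL."""
    if not 0 <= percent_polar <= 100:
        raise ValueError("percent_polar must be between 0 and 100")
    polar_ml = total_ml * percent_polar / 100
    return polar_ml, total_ml - polar_ml


# (fraction, more polar solvent, % v/v, less polar solvent), transcribed from Table 7
STEPS = [
    ("A", "EtOAc", 50, "hex"), ("B", "EtOAc", 70, "hex"), ("C", "EtOAc", 90, "hex"),
    ("D", "EtOAc", 100, "hex"), ("E", "MeOH", 5, "EtOAc"), ("F", "MeOH", 10, "EtOAc"),
    ("G", "MeOH", 20, "EtOAc"), ("H", "MeOH", 20, "EtOAc"), ("I", "MeOH", 50, "EtOAc"),
    ("J", "MeOH", 70, "EtOAc"), ("K", "MeOH", 90, "EtOAc"), ("L", "MeOH", 100, "EtOAc"),
]

totals = defaultdict(float)
for _, polar_name, percent, other_name in STEPS:
    polar_ml, other_ml = mixture_volumes(percent, total_ml=300)
    totals[polar_name] += polar_ml
    totals[other_name] += other_ml

for solvent, volume_ml in totals.items():
    print(f"{solvent}: {volume_ml:.0f} mL in total")
```

For example, 300 mL of 70% EtOAc (hex) corresponds to 210 mL of ethyl acetate and 90 mL of hexane under these assumptions.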

4.2.2.1 - Flash-chromatography of fraction 31B

The active fraction 31B was further fractionated by normal-phase flash chromatography. Silica gel (0.040-0.063 mm) was solvated with the initial elution solvent mixture: 10% ethyl acetate:90% hexane. The sample was resuspended in the same solution and added to the top of the silica layer, followed by a layer of sand to protect the silica surface from the impact of solvent addition. Elution was performed using a gradient of increasing polarity from 20% EtOAc (hex) to 25% MeOH (EtOAc) to 100% MeOH (Table 8). TLC was used to analyse the composition of the collected subfractions (mobile phase: 30% EtOAc (hex)), which were pooled based on the resulting profile, yielding 9 subfractions.

Table 8 - Solvent mixtures used for elution on Flash-Chromatography of fraction 31B.

Subfraction   Solvent mixture       Volume
31B-1         20% EtOAc (hex)       30 mL
31B-2         30% EtOAc (hex)       30 mL
31B-3         30% EtOAc (hex)       30 mL
31B-4         30% EtOAc (hex)       300 mL
31B-5         30% EtOAc (hex)       20 mL
31B-6         40% EtOAc (hex)       50 mL
31B-7         40% EtOAc (hex)       150 mL
31B-8         50% EtOAc (hex)       100 mL
              70% EtOAc (hex)       100 mL
              80% EtOAc (hex)       100 mL
              25% MeOH (EtOAc)      100 mL
              50% MeOH (EtOAc)      100 mL
31B-9         100% MeOH (EtOAc)     300 mL

5 - Bioassays

The organic extracts, fractions and subfractions were dissolved in DMSO to obtain stock solutions with final concentrations of 3 mg mL⁻¹ and 1 mg mL⁻¹, to be tested in a series of pharmacologically relevant bioassays.

5.1 - Antimicrobial screening susceptibility assay

The 1 mg mL⁻¹ solutions were tested against two Gram-positive bacterial strains (Staphylococcus aureus ATCC 29213 and Bacillus subtilis ATCC 6633), two Gram-negative bacterial strains (Escherichia coli ATCC 25922 and Salmonella typhimurium ATCC 25241) and a yeast strain (Candida albicans ATCC 10231). The bacteria were grown on Mueller-Hinton agar (MH – BioKar diagnostics) from stock cultures and incubated at 37 ºC. The yeast was grown on Sabouraud Dextrose Agar (BioKar diagnostics). For the antibacterial screening, the disc diffusion method was used. Pure bacterial colonies were picked with a swab from overnight cultures on MH and suspended in LB liquid medium, the turbidity was adjusted to the 0.5 McFarland standard and the MH plates were seeded with the resulting inoculum. Blank discs (6 mm in diameter - Oxoid) were placed on the inoculated plates and impregnated with 15 μL of a 1 mg mL⁻¹ solution of each organic extract. The plates were left for 30 min at room temperature and then incubated overnight at 37 ºC. After 24 h, the plates were checked for inhibition halos, indicative of antimicrobial activity, and the diameter of the halos was recorded.

5.2 - MTT Assay

The obtained extracts were tested against two cancer cell lines: breast ductal carcinoma (T-47D) and neuroblastoma (SH-SY5Y), both from Sigma-Aldrich. Cell lines were cultivated in Dulbecco's Modified Eagle Medium (DMEM) from Gibco (Thermo Fisher Scientific) supplemented with 10% (v/v) fetal bovine serum (Biochrom), 1% (v/v) Penicillin/Streptomycin (Biochrom) at 100 IU/mL and 10 mg mL⁻¹, respectively, and 0.1% (v/v) amphotericin (GE Healthcare). Cellular viability was evaluated by the reduction of 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT).

Cells were incubated in a humidified atmosphere with 5% CO2, at 37 ºC. The cell lines were seeded in 96-well culture plates at a density of 6.6 × 10⁴ cells mL⁻¹. After 24 h of adhesion in 100 μL of medium, the cells were incubated in fresh medium with 0.5% of the organic extract at 3 mg mL⁻¹, or with 0.5% and 20% DMSO as the negative and positive controls, respectively. Cell viability was assessed at 24 and 48 h by the addition of MTT at a final concentration of 0.2 mg mL⁻¹, followed by incubation for 4 h at 37 ºC. Following exposure, the purple-coloured formazan salts were dissolved in 100 µL DMSO and the absorbance was measured at 570 nm in a microplate reader (Biotek). Cellular viability was expressed as a percentage relative to the negative control, and the assays were performed in triplicate.
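The quantities in the assay description can be made explicit with a short calculation. The sketch below assumes that "0.5% of the organic extract" is a volume fraction of the 3 mg/mL stock in the well and that viability is the ratio of mean absorbances at 570 nm without any further correction; the absorbance readings are hypothetical values used only to exercise the function.

```python
from statistics import mean


def percent_viability(sample_abs, negative_control_abs):
    """Viability (%) as mean sample absorbance relative to the mean negative control."""
    control = mean(negative_control_abs)
    if control <= 0:
        raise ValueError("negative-control absorbance must be positive")
    return 100 * mean(sample_abs) / control


# Seeding: 6.6e4 cells/mL in 100 uL of medium per well.
cells_per_well = 6.6e4 * 100 / 1000

# Assumption: 0.5% (v/v) of a 3 mg/mL stock, expressed in ug/mL.
extract_final_ug_per_ml = 3.0 * 1000 * 0.5 / 100

# Hypothetical triplicate absorbance readings (not data from this work).
sample = [0.40, 0.42, 0.38]
negative = [0.80, 0.82, 0.78]
viability = percent_viability(sample, negative)

print(f"{cells_per_well:.0f} cells per well, extract at {extract_final_ug_per_ml:.0f} ug/mL")
print(f"Viability relative to negative control: {viability:.1f}%")
```

Under these assumptions each well holds about 6600 cells and the extracts are tested at about 15 μg/mL.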


IV-Results

1 - Biodiversity of a soil transect and of a rock with endolithic colonization in Victoria Valley, Victoria Land, Antarctica

A total of 71447 raw reads were obtained by 454 pyrosequencing for the seven samples studied; after the quality filtering step, 65625 reads remained. The number of sequences per sample ranged between 2910 and 17570.

1.1 - Alpha-diversity

Alpha diversity 116 was measured using the chao1 102, phylogenetic diversity (PD) and number of observed OTUs metrics. Rarefaction plots (α-diversity vs sequencing effort) were created for each measure (Figure 10). Concerning the number of observed OTUs, as expected, the endolithic bacterial community is much less diverse than the soil transect communities (Figure 10-C). Ordered by increasing diversity, at 2910 sequences, the number of observed OTUs was 166, 314, 364, 571, 736, 765 and 784 for samples END, T3, T5, T4, T2, T1 and T6, respectively (Supplementary Table S 5). Chao1, a species-based measure of α-diversity, was used to assess the sequencing depth. Figure 10-A shows that, at this depth, none of the transect samples (T1-T6) was sequenced to saturation, as a plateau was not fully reached. In contrast, saturation was attained for the END sample. Concerning PD, a taxon richness measure that includes a phylogenetic component, samples T6, T1 and T2 are the most phylogenetically diverse and END the least. The rarefaction curves follow a pattern similar to that obtained with chao1. According to these results, the sequencing effort did not fully cover the bacterial diversity of the transect samples, but did cover that of the endolithic sample.
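To make the two ideas used above concrete, the sketch below shows a richness estimate in the commonly used bias-corrected Chao1 form, S_obs + F1(F1 - 1) / (2(F2 + 1)), where F1 and F2 are the numbers of singleton and doubleton OTUs, together with random subsampling of a sample to a fixed depth (rarefaction). It is an illustration with a hypothetical OTU count vector, not the pipeline used in this work, and the exact Chao1 variant applied by the analysis software is an assumption.

```python
import random
from collections import Counter


def chao1(otu_counts):
    """Bias-corrected Chao1 richness estimate from per-OTU read counts."""
    observed = sum(1 for c in otu_counts if c > 0)
    singletons = sum(1 for c in otu_counts if c == 1)
    doubletons = sum(1 for c in otu_counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


def rarefy(otu_counts, depth, rng):
    """Subsample `depth` reads without replacement and return the new counts."""
    pool = [otu for otu, count in enumerate(otu_counts) for _ in range(count)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    picked = Counter(rng.sample(pool, depth))
    return [picked.get(otu, 0) for otu in range(len(otu_counts))]


# Hypothetical counts for eight OTUs (not data from this work).
counts = [10, 5, 2, 2, 1, 1, 1, 0]
print("Chao1 on all reads:", chao1(counts))

subsample = rarefy(counts, depth=12, rng=random.Random(1))
print("Observed OTUs at depth 12:", sum(1 for c in subsample if c > 0))
```

A rarefaction curve is obtained by repeating the subsampling at increasing depths and plotting the chosen metric against depth; a curve that flattens indicates that additional reads add few new OTUs.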


Figure 10 – Rarefaction curves for alpha-diversity metrics. (A)- chao1; (B)-Phylogenetic diversity; (C)-observed OTUs.

1.2 – Beta-diversity

Two different beta-diversity metrics were employed, weighted and unweighted UniFrac 103, both phylogeny-based. The resulting output was summarized by Principal Coordinates Analysis (PCoA) (Figure 11). Principal coordinate 1 (PC1) explains 29.16% and 44.6% of the variation for unweighted and weighted UniFrac, respectively. Both plots show a similar clustering pattern: samples T1 and T2 cluster together, as do samples T4, T5 and T6. Samples T3 and END seem to cluster independently.

[Figure 11 plots. Panel A axes: PC1 (29.16%), PC2 (20.23%), PC3 (15.88%). Panel B axes: PC1 (44.6%), PC2 (23.9%), PC3 (17.81%). Points are labelled END and T1-T6.]

Figure 11 – PCoA plots using the unweighted (A) and weighted (B) UniFrac metrics.
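The unweighted UniFrac distance underlying panel A can be illustrated on a toy tree: it is the fraction of the total branch length, among branches leading to taxa present in either sample, that leads to taxa from only one of the two samples. The sketch below encodes a tree as a list of branches, each with its length and the set of tips below it; the tree, the tip names and the samples are hypothetical, and this is not the implementation used in the analysis.

```python
def unweighted_unifrac(branches, sample_a, sample_b):
    """Unique branch length divided by total branch length covered by either sample.

    branches: iterable of (length, set_of_tip_names_below_branch)
    sample_a, sample_b: sets of tip names observed in each sample
    """
    unique = 0.0
    covered = 0.0
    for length, tips in branches:
        in_a = bool(tips & sample_a)
        in_b = bool(tips & sample_b)
        if in_a or in_b:
            covered += length
            if in_a != in_b:  # branch leads to taxa from only one sample
                unique += length
    if covered == 0:
        raise ValueError("neither sample contains any tip of the tree")
    return unique / covered


# Toy tree ((t1:1,t2:1):1,(t3:1,t4:1):1), written as one entry per branch.
TREE = [
    (1.0, {"t1"}), (1.0, {"t2"}), (1.0, {"t3"}), (1.0, {"t4"}),
    (1.0, {"t1", "t2"}), (1.0, {"t3", "t4"}),
]

distance = unweighted_unifrac(TREE, {"t1", "t2"}, {"t2", "t3"})
print(f"Unweighted UniFrac distance: {distance:.2f}")
```

Weighted UniFrac additionally weights each branch by the difference in relative abundance between the two samples, which is why the two panels can explain different proportions of variation.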

1.3 – Taxonomic composition

Figure 12 summarizes the taxonomic composition at the phylum level. A total of 19 bacterial phyla were detected across all the samples, with Actinobacteria (32.4%), Proteobacteria (19.2%), Cyanobacteria (14.9%) and Bacteroidetes (12.1%) being the most abundant and present in all samples. In this study, the distribution of Actinobacteria and Cyanobacteria, two of the most chemically prolific phyla, was given special attention. As shown in Figure 12-B, the endolithic sample is mostly composed of Cyanobacteria (44.7%) and Actinobacteria (30.3%). Regarding the soil transect, the percentage of Cyanobacteria is highest in the first three samples (around 20%) and negligible in the last three. In parallel, the percentage of Actinobacteria tends to increase as the percentage of Cyanobacteria decreases.

Taxonomy                          Total   END     T1      T2      T3      T4      T5      T6
Unassigned                        0.6%    0.1%    1.1%    0.7%    1.0%    0.6%    0.3%    0.7%
k__Bacteria;p__Acidobacteria      9.1%    1.5%    11.0%   8.2%    1.1%    13.4%   10.2%   18.5%
k__Bacteria;p__Actinobacteria     32.4%   30.3%   15.9%   24.2%   17.2%   46.6%   54.8%   37.8%
k__Bacteria;p__Armatimonadetes    0.0%    0.0%    0.1%    0.0%    0.0%    0.0%    0.0%    0.1%
k__Bacteria;p__BRC1               0.0%    0.0%    0.0%    0.0%    0.0%    0.1%    0.1%    0.0%
k__Bacteria;p__Bacteroidetes      12.1%   3.8%    10.2%   10.4%   34.5%   5.9%    13.2%   6.8%
k__Bacteria;p__Chlorobi           0.2%    0.0%    0.7%    0.4%    0.0%    0.2%    0.0%    0.2%
k__Bacteria;p__Chloroflexi        4.2%    9.6%    4.0%    4.4%    0.9%    3.9%    2.7%    3.8%
k__Bacteria;p__Cyanobacteria      14.9%   44.7%   23.3%   15.1%   20.0%   0.9%    0.3%    0.4%
k__Bacteria;p__Elusimicrobia      0.0%    0.0%    0.2%    0.1%    0.0%    0.0%    0.0%    0.0%
k__Bacteria;p__FBP                0.0%    0.0%    0.0%    0.0%    0.0%    0.0%    0.1%    0.0%
k__Bacteria;p__Firmicutes         0.1%    0.0%    0.0%    0.3%    0.0%    0.0%    0.0%    0.2%
k__Bacteria;p__Gemmatimonadetes   3.2%    1.2%    3.9%    6.3%    2.0%    4.0%    3.4%    1.8%
k__Bacteria;p__Nitrospirae        0.2%    0.0%    0.6%    0.1%    0.0%    0.0%    0.0%    0.6%
k__Bacteria;p__Planctomycetes     0.0%    0.0%    0.0%    0.0%    0.0%    0.0%    0.0%    0.1%
k__Bacteria;p__Proteobacteria     19.2%   5.8%    27.2%   24.6%   19.6%   18.7%   13.9%   24.7%
k__Bacteria;p__Spirochaetes       0.0%    0.0%    0.0%    0.0%    0.0%    0.0%    0.0%    0.1%
k__Bacteria;p__TM7                1.2%    0.0%    0.3%    2.6%    1.2%    2.7%    0.4%    1.0%
k__Bacteria;p__Verrucomicrobia    1.4%    0.2%    1.4%    2.1%    0.6%    2.9%    0.1%    2.7%
k__Bacteria;p__[Thermi]           0.8%    2.7%    0.0%    0.1%    1.6%    0.1%    0.6%    0.3%

Figure 12 - (A) Bar chart of frequency of phyla-affiliated OTUs per sampling point taxonomy summary. (B) Summary table of taxonomic frequency distributions at Phylum level. The phyla with 0% of distribution were eliminated from the table.

The last three samples harbour a higher abundance of Actinobacteria: 46.6, 54.8 and 37.8%, respectively. At lower taxonomic levels, members of the Actinobacteria phylum are mainly distributed among three classes (Acidimicrobiia, Actinobacteria and Thermoleophilia) and four families: Sporichthyaceae (7.8%), Patulibacteraceae (2.7%), Nocardioidaceae (1.5%) and Gaiellaceae (1.2%). Among the genera that could be assigned (Supplementary Table S 7), two are the most abundant: Euzebya (3%) and Rubrobacter (1.7%). Some taxonomic groups are also restricted to specific samples. The Nocardioidaceae and Patulibacteraceae families are present essentially in the last three samples (T4-T6). Further, the genus Rubrobacter is present in the END sample (5.5%) and in the last three samples (around 2%). The genus Euzebya is present mainly in the END (15.4%) and T3 (3.9%) samples. Similarly, the family Intrasporangiaceae is exclusively present in these two samples (2.1% in END and 1.3% in T3). Interestingly, END is the only sample harbouring bacteria from the Streptomycetaceae family (0.7%).

Concerning the Cyanobacteria phylum (Supplementary Table S 6), three families are overrepresented: Acaryochloridaceae (6.3%), Pseudanabaenaceae (2.6%) and Phormidiaceae (1.9%). The most abundant genera are Acaryochloris (6.3%), exclusively found in END (44.2%), Leptolyngbya, mainly present in sample T1 (2.5%), and Phormidium (1.9%), mainly present in sample T3 (11%). Additionally, the abundance of Acidobacteria seems to follow a distribution pattern similar to those of Cyanobacteria and Actinobacteria, with a decreasing percentage in the first three samples and an increased percentage in the last three samples of the soil transect. Furthermore, the abundance of Proteobacteria seems to be relatively constant along the transect. It is also noteworthy that the Spirochaetes and Planctomycetes phyla were only found in sample T6.
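The opposite trends of Cyanobacteria and Actinobacteria along the transect can be summarized by averaging the phylum-level percentages of Figure 12-B over the first three (T1-T3) and last three (T4-T6) samples. The values below are transcribed from that table; the grouping into two halves and the variable names are choices made for this illustration only.

```python
from statistics import mean

# Relative abundance (%) per transect sample, transcribed from Figure 12-B.
ACTINOBACTERIA = {"T1": 15.9, "T2": 24.2, "T3": 17.2, "T4": 46.6, "T5": 54.8, "T6": 37.8}
CYANOBACTERIA = {"T1": 23.3, "T2": 15.1, "T3": 20.0, "T4": 0.9, "T5": 0.3, "T6": 0.4}

FIRST_HALF = ("T1", "T2", "T3")
SECOND_HALF = ("T4", "T5", "T6")


def half_means(abundance):
    """Mean relative abundance in the first and second half of the transect."""
    return (mean(abundance[s] for s in FIRST_HALF),
            mean(abundance[s] for s in SECOND_HALF))


actino_first, actino_second = half_means(ACTINOBACTERIA)
cyano_first, cyano_second = half_means(CYANOBACTERIA)

print(f"Actinobacteria: {actino_first:.1f}% (T1-T3) vs {actino_second:.1f}% (T4-T6)")
print(f"Cyanobacteria:  {cyano_first:.1f}% (T1-T3) vs {cyano_second:.1f}% (T4-T6)")
```

The averages are consistent with the description in the text: Cyanobacteria are close to 20% in T1-T3 and below 1% in T4-T6, while Actinobacteria more than double between the two halves of the transect.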

1.4 – Predicted Functional profile from 16S rRNA gene

Functional profiles and KEGG pathways of the endolithic and soil transect samples were predicted from the 16S rRNA gene using PICRUSt 104. Pathways associated with secondary metabolism were inspected in detail. The results showed a similar pattern across the different samples. Pathways involved in the metabolism of different antibiotics (ansamycins, vancomycin, novobiocin, tetracycline, penicillin and streptomycin) are predicted to be present in virtually all samples, although at a low percentage. Remarkably, the pathway for streptomycin biosynthesis is predicted by PICRUSt to be overrepresented (0.3 – 0.4%) compared with the remaining secondary metabolite biosynthesis pathways. In addition, the presence of pathways involved in polyketide and nonribosomal peptide biosynthesis, two of the most important families of natural products, was also predicted.

Table 9 – PICRUSt KEGG pathways. Biosynthesis of Type II PKS was 0 in all the samples (data not shown). The pathways with 0% in all samples were removed from the table.

Pathway                                                    Total   END     T1      T2      T3      T4      T5      T6

Metabolism;Metabolism of Terpenoids and Polyketides
Biosynthesis of ansamycins                                 0.1%    0.0%    0.1%    0.1%    0.1%    0.1%    0.0%    0.1%
Biosynthesis of siderophore group nonribosomal peptides    0.1%    0.1%    0.0%    0.1%    0.1%    0.1%    0.0%    0.1%
Biosynthesis of vancomycin group antibiotics               0.1%    0.1%    0.1%    0.1%    0.1%    0.1%    0.1%    0.1%
Carotenoid biosynthesis                                    0.1%    0.2%    0.1%    0.1%    0.1%    0.1%    0.1%    0.1%
Geraniol degradation                                       0.5%    0.3%    0.3%    0.4%    0.4%    0.6%    0.6%    0.6%
Limonene and pinene degradation                            0.6%    0.3%    0.4%    0.5%    0.5%    0.6%    0.7%    0.6%
Polyketide sugar unit biosynthesis                         0.2%    0.2%    0.2%    0.2%    0.2%    0.2%    0.2%    0.2%
Prenyltransferases                                         0.4%    0.5%    0.4%    0.4%    0.4%    0.4%    0.4%    0.3%
Terpenoid backbone biosynthesis                            0.6%    0.5%    0.5%    0.6%    0.6%    0.6%    0.6%    0.6%
Tetracycline biosynthesis                                  0.1%    0.2%    0.1%    0.1%    0.1%    0.1%    0.1%    0.1%

Metabolism;Biosynthesis of Other Secondary Metabolites
Butirosin and neomycin biosynthesis                        0.1%    0.1%    0.1%    0.1%    0.1%    0.1%    0.1%    0.1%
Flavonoid biosynthesis                                     0.0%    0.1%    0.1%    0.0%    0.0%    0.1%    0.1%    0.0%
Isoquinoline alkaloid biosynthesis                         0.1%    0.0%    0.1%    0.1%    0.0%    0.1%    0.0%    0.1%
Novobiocin biosynthesis                                    0.1%    0.1%    0.1%    0.1%    0.1%    0.1%    0.1%    0.1%
Penicillin and cephalosporin biosynthesis                  0.1%    0.1%    0.1%    0.1%    0.1%    0.1%    0.1%    0.1%
Phenylpropanoid biosynthesis                               0.1%    0.1%    0.1%    0.1%    0.2%    0.1%    0.1%    0.1%
Stilbenoid, diarylheptanoid and gingerol biosynthesis      0.1%    0.1%    0.1%    0.1%    0.1%    0.1%    0.1%    0.1%
Streptomycin biosynthesis                                  0.4%    0.4%    0.3%    0.4%    0.3%    0.3%    0.4%    0.3%
Tropane, piperidine and pyridine alkaloid biosynthesis     0.1%    0.1%    0.1%    0.1%    0.1%    0.1%    0.1%    0.1%
beta-Lactam resistance                                     0.0%    0.1%    0.1%    0.0%    0.0%    0.0%    0.0%    0.0%

Unclassified: Metabolism
Biosynthesis and biodegradation of secondary metabolites   0.1%    0.1%    0.1%    0.1%    0.1%    0.1%    0.0%    0.1%


2- Biodiversity of culturable strains from the McMurdo Dry Valleys, Antarctica

In total, 13 Firmicutes, 13 Actinobacteria, 6 Proteobacteria and 4 fungal strains were isolated and identified from soil transect samples; further strains are still in the process of isolation.

2.1 – Isolation, identification and phylogenetic analysis of obtained Isolates

2.1.1 – Firmicutes strains

Concerning members of the Firmicutes phylum, 13 strains were obtained from the environmental sample T6. These strains were grown at 4 ºC on an oligotrophic medium (NPS) and streaked on a more nutritive medium (ISP2) until a pure culture was achieved. The strains grew very slowly and favoured temperatures below 9 ºC. Morphologically, the strains are very similar, with light-coloured coccoid cells (Figure 13). According to the molecular identification through 16S rRNA gene phylogenetic analysis, the 13 strains belong to two (possibly three) different species of the Paenisporosarcina genus. The best blast(n) matches in the NCBI database show 99% identity to two different species: Paenisporosarcina macmurdoensis strain CMS 21w and Paenisporosarcina indica strain PN2.

Figure 13 – Microphotographs of Paenisporosarcina sp. isolates. From left to right: Paenisporosarcina macmurdoensis strain 2F, Paenisporosarcina macmurdoensis strain 13G and Paenisporosarcina sp. 16D.

Concerning the phylogenetic analysis, the phylogenetic tree (Figure 14) shows that all 13 isolates group within the Paenisporosarcina clade and thus belong to this genus. Strains 2F, 2H, 13F, 13G, 16E, 17, 37 and 47 group together with Paenisporosarcina macmurdoensis strain CMS 21w, supported by a strong bootstrap value (99). Taking also into account the genetic distance between these isolates and Paenisporosarcina macmurdoensis strain CMS 21w in the phylogenetic tree, these strains were assigned to this species.

[Figure 14 tree. Clade labels: Paenisporosarcina genus and Sporosarcina genus. Scale bar: 0.020.]

Figure 14 - Phylogenetic tree of the 16S rRNA nucleotide sequences of the obtained isolates (2F, 2H, 13F, 13G, 16D, 16E, 17, 34, 36, 39, 47) from Firmicutes phylum and the closest matches at NCBI 16S database. The tree was reconstructed using the Maximum Likelihood method based on the Tamura-Nei model 111, with 37 nucleotide sequences, in MEGA7 65. The 16S rRNA gene sequences from the cyanobacterial strains Microcystis aeruginosa strain NIES-843 and Gloeocapsa sp. PCC 7428 were included as outgroup.

Strains 36, 39, 16D and 34 group together in a subclade supported by a bootstrap value of 69 and may represent a novel species of the Paenisporosarcina genus. Strain 46 groups separately with Paenisporosarcina quisquiliarum strain SK 55, although this grouping is poorly supported (bootstrap value of 18).

The phylogenetic analysis allowed strains 2F, 2H, 13F, 13G, 16E, 17, 37 and 47 to be assigned to a known species. The remaining strains could not be assigned to a species, and were therefore classified only at the genus level (Paenisporosarcina sp.).

2.1.2 – Actinobacterial strains

Thirteen actinobacterial strains were obtained from sample T5 of the soil transect. Most of them were obtained from Pre-treatment 1, 10⁻¹ dilution, in AIA medium at 28 ºC. Morphologically, most of the strains are very similar, with coccoid cells and a strong yellow colour (Figure 15), except for strains AT20, AT4 and AT5, which exhibited smaller, light-coloured colonies. From the molecular identification through 16S rRNA gene phylogenetic analysis, all strains were found to belong to the order Micrococcales, distributed among four different genera: Micrococcus, Kocuria, Flexivirga and Dermacoccus.

Figure 15 - Photographs of Actinobacterial strains isolated from soil sample T5. A) Bacterial growth in the initial plate of pre-treatment 1, 10⁻¹ dilution, incubated at 28 ºC. B) Micrococcus sp. strain AT7; C) Kocuria sp. strain AT14.

In the phylogenetic tree (Figure 16), the isolated strains group into one of two different families: Micrococcaceae and Dermacoccaceae. Strains AT1, AT2, AT7, AT9, AT19 and AT23 group within the Micrococcus clade, together with Micrococcus yunnanensis strain YIM 65004, Micrococcus luteus strain ATCC 4698 and Micrococcus aloeverae strain AE-6. The best blast(n) match for these strains has 99% identity to Micrococcus yunnanensis strain YIM 65004, except for strain AT9, which has 99% identity to Micrococcus aloeverae strain AE-6. Strains AT6(2), AT6(3)-1, AT14 and AT22 group together with Kocuria rhizophila strain TA68 and Kocuria arsenatis strain CM1E1 within the Kocuria genus clade, well supported by a bootstrap value of 99. The best blast(n) match for these strains has 99% identity to Kocuria rhizophila strain TA68 and, in accordance with the genetic distance, these strains were assigned to the species Kocuria rhizophila. The phylogeny of strains AT5, AT4 and AT20 is well resolved in the phylogenetic tree. Strains AT4 and AT5 group with Dermacoccus nishinomiyaensis strain DSM 20448, supported by a high bootstrap value (97), and were assigned to this species. Strain AT20 has a best blast(n) match of 98% identity to Flexivirga alba. Together with this identity value, the phylogenetic analysis indicates that this strain might represent a new species of the Flexivirga genus.

2.1.3 – Proteobacteria Isolates

Six strains were obtained from soil sample T5, but grew poorly under the tested conditions. The strains were morphologically very similar to each other, forming small white colonies. From the molecular characterization using the 16S rRNA gene, all six isolates showed high (99%) identity to Bradyrhizobium embrapense strain SEMIA 6208 and were thus assigned to this taxon. Although the 16S rRNA gene is commonly used for the molecular and phylogenetic identification of bacterial strains, in some genera, including Bradyrhizobium, it is highly conserved, which limits species discrimination. Phylogenies of the Bradyrhizobium genus are usually built using a variety of molecular markers 117. Here, it was not possible to compute a well-supported phylogenetic tree able to discriminate between the species and the obtained strains.

FCUP 53 Exploring Polar Microbiomes as Source of Bioactive Molecules

[Figure 16 image: maximum-likelihood tree. Isolates AT1, AT2, AT7, AT9, AT19 and AT23 fall in the Micrococcus clade and isolates AT6(2), AT6(3)-1, AT14 and AT22 in the Kocuria clade (family Micrococcaceae); isolates AT4, AT5 and AT20 fall in family Dermacoccaceae. Scale bar: 0.050.]

Figure 16 - Phylogenetic tree of the 16S rRNA nucleotide sequences of the isolates obtained from the Actinobacteria phylum and their closest matches in the NCBI 16S database. The tree was reconstructed using the Maximum Likelihood method based on the Tamura-Nei model 111, with 46 nucleotide sequences, in MEGA7 65. The 16S rRNA sequences from the cyanobacterial strains Microcystis aeruginosa strain NIES-843 and Gloeocapsa sp. PCC 7428 were included as outgroup.

2.1.4 – Fungi Isolates

Four fungal strains were also obtained during the isolation process from samples T5 and T6. Strains FG1 and FP1 were obtained from sample T5 on AIA/SCN medium at 4 °C, while strains 31 and 41 were obtained from soil sample T6, grown in ISP2 medium at 4 °C. Morphologically, the strains are similar, but it is possible to distinguish them visually (Figure 17). According to the molecular identification (sequences from the ITS and D1/D2 regions), the four isolates belong to the Ascomycota phylum and are distributed across two families, Xylariaceae and Aspergillaceae. The best blast(n) matches range from 96% to 100% identity for the different isolates.

Figure 17 - Photographs of fungal strains isolated from soil samples T5 and T6. (A) Penicillium citrinum strain 31 on ISP2 solid medium. (B) Dicyma pulvinata strain 41 on MEA solid medium. (C) Penicillium decaturense strain FG1 on AIA solid medium.

From the phylogenetic analysis (Figure 18), strains FP1 and FG1 are clonal and group together within the Penicillium clade with P. decaturense. The blast(n) match in the NCBI nucleotide database is 99% identity to this species. Based on both the molecular and the phylogenetic analysis, the strains were assigned to this species. Strain 31, isolated from sample T6, also clusters within the Penicillium clade and groups clearly with P. citrinum. The blast(n) analysis showed 100% identity to this taxon and thus strain 31 was assigned to P. citrinum. Strain 41 belongs to the Xylariaceae family. Molecular identification revealed 96% identity (91% coverage) to Dicyma pulvinata isolate CBS 194.56. On the phylogenetic tree, the two strains group together, supported by a high bootstrap value (95).

[Figure 18 image: maximum-likelihood tree. Isolates FG1 and FP1 fall among the Penicillium decaturense sequences and isolate 31 next to the Penicillium citrinum sequences (family Aspergillaceae); isolate 41 falls next to Dicyma pulvinata isolate CBS 194.56 (family Xylariaceae). Scale bar: 0.050.]

Figure 18 - Phylogenetic tree of the ITS and D1/D2 rDNA nucleotide sequences of the obtained Fungi isolates and the closest matches at NCBI 16S rRNA sequences database. The tree was reconstructed using the Maximum Likelihood method based on the Tamura-Nei model 111, with 25 nucleotide sequences, in MEGA7 65.


3– Bioactive potential of Isolated Microorganisms

3.1 – PCR Screening of bacterial isolates: PKS and NRPS genes

To gain insight into the genetic capacity of the new isolates for NP production, primers targeting the KS domain of PKS genes and the AD domain of NRPS genes were used to screen the bioactive potential of the bacterial isolates obtained from the Firmicutes, Actinobacteria and Proteobacteria phyla.

[Figure 19 image: agarose gels of the PCR products obtained with primer pairs degK2F/degK2R and DKF/DKR; lanes M, 1-11, C+ and C-; ladder bands labelled at 5000, 2000, 1500, 1000 and 500 bp.]

Figure 19 - PCR amplification of the KS domain using primer pairs degK2F/degK2R (expected 700 bp) and DKF/DKR (expected 1000 bp). M – GRS ladder 1kb (Grisp), 1 – gDNA from isolate 2F, 2 – gDNA from isolate 2H, 3 – gDNA from isolate 13F, 4 – gDNA from isolate 13G, 5 – gDNA from isolate 16E, 6 – gDNA from isolate 17, 7 – gDNA from isolate 35, 8 – gDNA from isolate 36, 9 – gDNA from isolate 37, 10 – gDNA from isolate 46, 11 – gDNA from isolate 47, C+ – gDNA from Streptomyces sp. S4R2 (positive control), C- – negative control. (a) using primer pair degK2F/degK2R; (b) using primer pair DKF/DKR.

Primer pairs degK2F/degK2R and DKF/DKR were used to amplify by PCR the KS domain of PKS genes in the bacterial isolates from the Firmicutes phylum. The amplification results (Figure 19) indicate an amplified product of the expected length (700 bp) from the gDNA of Paenisporosarcina macmurdoensis strains 2F, 2H, 13F, 17 and 37, and of Paenisporosarcina sp. strain 46. Primer pairs A3F/A7R and MTF/MTR were used to amplify the AD domain of NRPS genes (Figure 20). Amplified PCR products of the expected length were observed in all the tested isolates using primer pair A3F/A7R. With primer pair MTF/MTR (expected 1000 bp), amplification was observed using gDNA from Paenisporosarcina macmurdoensis strains 2H and 17.

[Figure 20 image: agarose gels of the PCR products obtained with primer pairs A3F/A7R (lanes M, 1-11, C+ and C-) and MTF/MTR (lanes M, 1-11 and C-); ladder bands labelled at 5000, 2000, 1500, 1000 and 500 bp.]

Figure 20 – PCR amplification of the AD domain using primer pairs A3F/A7R (expected 700 bp) and MTF/MTR (expected 1000 bp). Legend: M – GRS ladder 1kb (Grisp), 1 – gDNA from isolate 2F, 2 – gDNA from isolate 2H, 3 – gDNA from isolate 13F, 4 – gDNA from isolate 13G, 5 – gDNA from isolate 16E, 6 – gDNA from isolate 17, 7 – gDNA from isolate 35, 8 – gDNA from isolate 36, 9 – gDNA from isolate 37, 10 – gDNA from isolate 46, 11 – gDNA from isolate 47, C+ – gDNA from Streptomyces sp. S4R2 (positive control), C- – negative control. (a) using primer pair A3F/A7R; (b) using primer pair MTF/MTR.

Regarding the Actinobacteria and Proteobacteria isolates, primer pairs A3F/A7R and degK2F/degK2R (specific for Actinobacteria 33) were used (Figure 21). For the KS domain, an amplified product of the expected length (700 bp) was observed with gDNA from Micrococcus sp. strain AT9 and Bradyrhizobium sp. strains AT10 and AT16; for the AD domain, it was observed with gDNA from Flexivirga sp. strain AT20.

[Figure 21 image: agarose gels of the PCR products obtained with primer pairs degK2F/degK2R and A3F/A7R; lanes M, 1-15, C+ and C-; ladder bands labelled at 5000, 2000, 1500, 1000 and 500 bp.]

Figure 21 – PCR amplification of the KS and AD domains using primer pairs degK2F/degK2R (expected 700 bp) and A3F/A7R (expected 700 bp), respectively. Legend: M – 1kb ladder (Grisp), 1 – AT1, 2 – AT2, 3 – AT4, 4 – AT5, 5 – AT6, 6 – AT6(2), 8 – AT7, 9 – AT9, 10 – AT10, 11 – AT13, 12 – AT16, 13 – AT19, 14 – AT20, 15 – AT22, C+ – S. natalensis ATCC 27448 (positive control), C- – negative control.

3.2 – Bioassay Screening

Fast-growing isolates (Actinobacteria from the Micrococcus and Kocuria genera, and Fungi) were selected for small-scale cultivation (100 mL of liquid medium) with a resin and then extracted with a MeOH:acetone mixture, to be tested in pharmacologically relevant bioassays. Dicyma pulvinata strain 41 was extracted by two different methods for comparative purposes: one as mentioned above (41(1)) and the other with EtOAc (41(2)). The fungal strain Penicillium citrinum 31 was selected for large-scale growth, in part because of its fast growth and also because of the well-known potential of Penicillium species to produce secondary metabolites 115.

3.2.1 – Antimicrobial Assay

To gain insight into the antimicrobial properties of the isolated strains, the organic extracts were tested against two Gram-positive bacteria, two Gram-negative bacteria and one yeast species by impregnating 15 µg of extract into an antibiogram disk. Among the organic extracts (Table 10), the extract from Micrococcus sp. strain AT1 showed activity against the yeast C. albicans, giving rise to a small inhibition halo (Figure 22). The extract from Kocuria rhizophila strain AT14 showed activity against a Gram-negative bacterium, Salmonella typhimurium (Table 7). The halo diameter could not be measured because the halo was not well defined. The extract from the fungus Dicyma pulvinata strain 41 (extracted with methanol:acetone) was also active against the yeast C. albicans (Figure 22).

Table 10 – Antimicrobial activity of organic extracts tested.

Activity (mm of inhibition) against:

Organic extract from | B. subtilis | S. aureus | E. coli | S. typhimurium | C. albicans
Micrococcus sp. strain AT1 | - | - | - | - | + (n.m.)
Micrococcus sp. strain AT2 | - | - | - | - | -
Kocuria sp. strain AT6(2) | - | - | - | - | -
Kocuria sp. strain AT6(3)-1 | - | - | - | |
Micrococcus sp. strain AT7 | - | - | - | - | -
Micrococcus sp. strain AT9 | | | | |
Kocuria sp. strain AT14 | - | - | - | + (n.m.) | -
Micrococcus sp. strain AT19 | - | - | - | - | -
Kocuria sp. strain AT22 | - | - | - | - | -
Micrococcus sp. strain AT23 | | | | |
Dicyma pulvinata strain 41(1)* | - | - | - | - | + (n.m.)
Dicyma pulvinata strain 41(2)* | - | - | - | - | -
Penicillium citrinum strain 31 | - | - | - | - | -
Penicillium decaturense strain FG1 | - | - | - | - | -

n.m. – not possible to measure (in this case, the inhibition halo was not well defined).


Figure 22 - Photographic record of inhibition halos. From left to right: inhibition of C. albicans by the organic extracts from Micrococcus sp. strain AT1 and Dicyma pulvinata strain 41(1), and inhibition of S. typhimurium by the Kocuria sp. strain AT14 extract.

Twelve fractions were obtained from the VLC fractionation of the crude extract from Penicillium citrinum strain 31. The antimicrobial assays (Table 11) show that fractions 31B (eluted with 70% ethyl acetate in hexane) and 31C (eluted with 90% ethyl acetate in hexane) were active against both Gram-negative and Gram-positive bacteria. Fraction 31B showed activity against all the tested bacteria, with a halo of 9 mm for S. aureus. Fraction 31C was active mainly against E. coli (8 mm) and B. subtilis (8 mm), as depicted in Figure 23.

Table 11 - Antimicrobial activity of VLC fractions from the crude extract of Penicillium citrinum strain 31.

Activity (mm of inhibition) against:

Fraction | E. coli | S. aureus | B. subtilis | S. typhimurium | C. albicans
31-A | - | - | - | - | -
31-B | + (8 mm) | + (9 mm) | + (8 mm) | + (7 mm) | -
31-C | + (8 mm) | + (7 mm) | + (8 mm) | - | -
31-D | - | - | - | - | -
31-E | - | - | - | - | -
31-F | - | - | + (7 mm) | - | -
31-G | - | - | - | - | -
31-H | - | - | - | - | -
31-I | - | - | - | - | -
31-J | - | - | - | - | -
31-K | - | - | - | - | -
31-L | - | - | - | - | -


Figure 23 - Photographic record of inhibition halos. Top row, from left to right: inhibition of E. coli, S. aureus, S. typhimurium and B. subtilis by fraction 31B. Bottom row: inhibition of E. coli, S. aureus and B. subtilis by fraction 31C and, last, inhibition of B. subtilis by fraction 31F.

Fraction 31B, which gave rise to the strongest inhibition halo, was selected for further analysis. The fraction (106.65 mg) was sub-fractionated by flash chromatography, yielding nine sub-fractions, which were tested using the same protocol. These proved to be more active than the parent fraction, as expected, since fractionation concentrates the compounds. Bacillus subtilis was the most susceptible bacterium (inhibited by eight out of nine sub-fractions), followed by S. aureus (inhibited by three out of nine sub-fractions) (Table 12). No activity was detected against the remaining organisms. Sub-fractions 6, 7 and 8 were the most active against B. subtilis (Figure 24).

Table 12 - Antimicrobial activity of flash-chromatography sub-fractions from 31B-1 to 31B-9.

Activity (mm of inhibition) against E. coli, S. aureus, B. subtilis, S. typhimurium and C. albicans, by sample:

31B-1: - - + - (9 mm)
31B-2: - - + - (8 mm)
31B-3: - - + - (8 mm)
31B-4: - + + + - (8 mm) (9 mm)
31B-5: - + + - (8 mm) (*)
31B-6: - - + + (10.4 mm)
31B-7: - + + + (7 mm) (10 mm)
31B-8: - - + (10 mm)
31B-9: - - - -


Figure 24 - Photographic record of inhibition halos of the sub-fractions tested. From left to right: inhibition of B. subtilis by sub-fractions 31B-1, 31B-2 and 31B-3, and inhibition of S. aureus and B. subtilis by sub-fraction 31B-4. Down, inhibition

3.2.2 – Cytotoxic Assay

All the extracts and fractions tested for antimicrobial activity were also tested for cytotoxic properties using the MTT assay. Two cell lines were used: SH-SY5Y (neuroblastoma) and T47-D (breast ductal carcinoma). For the SH-SY5Y cell line (Figure 25), the fungal extracts were more active than the bacterial ones, with cell viability below 50% after 48 h of exposure to the extracts.

[Figure 25 image: bar chart "SHSY5Y Cell Line Assay"; y-axis: % of cellular viability; x-axis: organic extracts (15 µg mL⁻¹); series: 24 h and 48 h.]

Figure 25 - Percentage of cell viability in the tumor cell line SH-SY5Y (neuroblastoma) after 24 h and 48 h of exposure to organic extracts of Actinobacteria isolates. The extracts (solutions at 3 mg mL⁻¹) were tested at a final concentration of 15 µg mL⁻¹; 20% DMSO was used as the positive control and 0.5% DMSO as the negative control (solvent control). AT1 – Micrococcus sp., AT2 – Micrococcus sp., AT6(2) – Kocuria rhizophila, AT6(3)-1 – Kocuria rhizophila, AT7 – Micrococcus sp., AT9 – Micrococcus sp., AT14 – Kocuria rhizophila, AT19 – Micrococcus sp., AT22 – Kocuria rhizophila, AT23 – Micrococcus sp., 31 – Penicillium citrinum, 41(1) – Dicyma pulvinata, extracted with MeOH:acetone, 41(2) – Dicyma pulvinata, extracted with EtOAc, FG1 – Penicillium decaturense.
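The concentrations quoted in the caption can be checked with a short calculation. The sketch below is only illustrative: it assumes the 3 mg/mL extract stocks were prepared in neat DMSO, which the caption does not state, and the 200 µL well volume is a hypothetical value chosen for the example. Under those assumptions, the 200-fold dilution to 15 µg/mL would leave 0.5% DMSO in the well, which matches the solvent control.

```python
def dilution_factor(stock_ug_per_ml: float, final_ug_per_ml: float) -> float:
    """Fold dilution needed to go from the stock to the final concentration."""
    return stock_ug_per_ml / final_ug_per_ml


def solvent_percent(factor: float) -> float:
    """Final solvent percentage (v/v), assuming the stock is 100% solvent."""
    return 100.0 / factor


def stock_volume_ul(well_volume_ul: float, factor: float) -> float:
    """Volume of stock to add so that the final well volume is well_volume_ul."""
    return well_volume_ul / factor


STOCK_UG_PER_ML = 3000.0  # 3 mg/mL, from the figure caption
FINAL_UG_PER_ML = 15.0    # 15 µg/mL, from the figure caption
WELL_VOLUME_UL = 200.0    # hypothetical well volume, not given in the text

factor = dilution_factor(STOCK_UG_PER_ML, FINAL_UG_PER_ML)
dmso_percent = solvent_percent(factor)  # assumption: stock dissolved in neat DMSO
stock_ul = stock_volume_ul(WELL_VOLUME_UL, factor)

print(f"dilution: {factor:.0f}-fold, DMSO: {dmso_percent}% v/v, stock per well: {stock_ul} µL")
```

If the stocks were prepared in a different solvent or at a different DMSO content, only `solvent_percent` would need to change; the dilution factor itself follows directly from the two concentrations in the caption.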


Concerning the bacterial extracts, only those from Micrococcus sp. strain AT1 and Kocuria sp. strain AT6(3)-1 showed moderate activity. For the T47-D cell line (Figure 26), the fungal extracts from Dicyma pulvinata strain 41 (1 and 2) and Penicillium decaturense strain FG1 were also the most active. The bacterial extracts did not show any significant activity against this cell line.

[Figure 26 image: bar chart "T47-D Cell Line Assay"; y-axis: % of cellular viability; x-axis: organic extracts (15 µg mL⁻¹); series: 24 h and 48 h.]

Figure 26 - Percentage of cell viability in the tumor cell line T47-D (breast ductal carcinoma) after 24 h and 48 h of exposure to organic extracts of Actinobacteria isolates. The extracts (solutions at 3 mg mL⁻¹) were tested at a final concentration of 15 µg mL⁻¹; 20% DMSO was used as the positive control and 0.5% DMSO as the negative control (solvent control). AT1 – Micrococcus sp., AT2 – Micrococcus sp., AT6(2) – Kocuria rhizophila, AT6(3)-1 – Kocuria rhizophila, AT7 – Micrococcus sp., AT9 – Micrococcus sp., AT14 – Kocuria rhizophila, AT19 – Micrococcus sp., AT22 – Kocuria rhizophila, AT23 – Micrococcus sp., 31 – Penicillium citrinum, 41(1) – Dicyma pulvinata, extracted with MeOH:acetone, 41(2) – Dicyma pulvinata, extracted with EtOAc, FG1 – Penicillium decaturense.

Regarding the VLC fractions of Penicillium citrinum strain 31, in the SH-SY5Y cell line (Figure 27) the activity seems to reside in the less polar fractions (fractions A to G). In particular, fraction A was the most active, with a cell viability of 42% after 48 h of exposure.


[Figure 27 image: bar chart "SHSY5Y Cell Line Assay - Fractions 31"; y-axis: % of cellular viability; x-axis: fractions (15 µg mL⁻¹); series: 24 h and 48 h.]

Figure 27 - Percentage of cell viability in the tumor cell line SH-SY5Y (neuroblastoma) after 24 h and 48 h of exposure to VLC fractions of Penicillium citrinum strain 31. The fractions (solutions at 3 mg mL⁻¹) were tested at a final concentration of 15 µg mL⁻¹; 20% DMSO was used as the positive control and 0.5% DMSO as the negative control (solvent control).

[Figure 28 image: bar chart "T47-D Cell Line Assay - Fractions 31"; y-axis: % of cellular viability; x-axis: fractions (15 µg mL⁻¹); series: 24 h and 48 h.]

Figure 28 - Percentage of cell viability in the tumor cell line T47-D (breast ductal carcinoma) after 24 h and 48 h of exposure to VLC fractions of Penicillium citrinum strain 31. The fractions (solutions at 3 mg mL⁻¹) were tested at a final concentration of 15 µg mL⁻¹; 20% DMSO was used as the positive control and 0.5% DMSO as the negative control (solvent control).

For cell line T47-D, no significant activity was recorded. Regarding the sub-fractions from fraction 31B, sub-fraction 31B-4 was the most active in both cell lines tested, with cell viabilities of 56% and 84% after 48 h of exposure in the SH-SY5Y and T47-D cell lines, respectively.

[Figure 29 image: bar chart "SHSY5Y Cell Line Assay - Subfractions 31B"; y-axis: % of cellular viability; x-axis: sub-fractions 31B-1 to 31B-9, positive control and negative control (15 µg/mL); series: 24 h and 48 h.]

Figure 29 - Percentage of cell viability in the tumor cell line SH-SY5Y (neuroblastoma) after 24 h and 48 h of exposure to VLC sub-fractions (fraction B) of Penicillium citrinum strain 31. The sub-fractions (solutions at 3 mg mL⁻¹) were tested at a final concentration of 15 µg mL⁻¹; 20% DMSO was used as the positive control and 0.5% DMSO as the negative control (solvent control).

[Figure 30 image: bar chart "T47D Cell Line Assay - 31B-subfractions"; y-axis: % of cellular viability; x-axis: sub-fractions 31B-1 to 31B-9, positive control and negative control (15 µg mL⁻¹); series: 24 h and 48 h.]

Figure 30 - Percentage of cell viability in the tumor cell line T47-D (breast ductal carcinoma) after 24 h and 48 h of exposure to VLC sub-fractions (fraction B) of Penicillium citrinum strain 31. The sub-fractions (solutions at 3 mg mL⁻¹) were tested at a final concentration of 15 µg mL⁻¹; 20% DMSO was used as the positive control and 0.5% DMSO as the negative control (solvent control).


Table 13 - Summary table of the obtained isolates and the results of the PKS/NRPS gene and bioassay screenings. Non-identified isolates and strains still in the isolation process were excluded from the table.

Isolate | Suggested species | Accession number (NCBI) | Query cover (%) | Identity (%) | Isolation source | Phylum | PCR screening: PKS | PCR screening: NRPS | Bioassay screening: antimicrobial | Bioassay screening: cytotoxic
2F | Paenisporosarcina macmurdoensis strain CMS 21w | NR_025573.1 | 100 | 99 | Soil-T6 (1) | Firmicutes | + | + | n.a. | n.a.
2H | Paenisporosarcina macmurdoensis strain CMS 21w | NR_025573.1 | 100 | 99 | Soil-T6 (2) | Firmicutes | + | + | n.a. | n.a.
13F | Paenisporosarcina macmurdoensis strain CMS 21w | NR_025573.1 | 99 | 99 | Soil-T6 (1) | Firmicutes | + | + | n.a. | n.a.
13G | Paenisporosarcina macmurdoensis strain CMS 21w | NR_025573.1 | 100 | 99 | Soil-T6 (3) | Firmicutes | - | + | n.a. | n.a.
16D | Paenisporosarcina macmurdoensis strain CMS 21w | NR_025573.1 | 100 | 99 | Soil-T6 (1) | Firmicutes | n.p. | n.p. | n.a. | n.a.
16E | Paenisporosarcina macmurdoensis strain CMS 21w | NR_025573.1 | 100 | 99 | Soil-T6 (2) | Firmicutes | - | + | n.a. | n.a.
17 | Paenisporosarcina macmurdoensis strain CMS 21w | NR_025573.1 | 100 | 99 | Soil-T6 (3) | Firmicutes | + | + | n.a. | n.a.
34 | Paenisporosarcina indica strain PN2 | NR_108473.1 | 100 | 99 | Soil-T6 (1) | Firmicutes | n.p. | n.p. | n.a. | n.a.
36 | Paenisporosarcina indica strain PN2 | NR_108473.1 | 100 | 99 | Soil-T6 (3) | Firmicutes | - | + | n.a. | n.a.
37 | Paenisporosarcina macmurdoensis strain CMS 21w | NR_025573.1 | 99 | 99 | Soil-T6 (3) | Firmicutes | + | + | n.a. | n.a.
39 | Paenisporosarcina macmurdoensis strain CMS 21w | NR_025573.1 | 100 | 99 | Soil-T6 (2) | Firmicutes | n.p. | n.p. | n.a. | n.a.
46 | Paenisporosarcina indica strain PN2 | NR_108473.1 | 100 | 99 | Soil-T6 (3) | Firmicutes | + | + | n.a. | n.a.
47 | Paenisporosarcina macmurdoensis strain CMS 21w | NR_025573.1 | 99 | 99 | Soil-T6 (5) | Firmicutes | - | + | n.a. | n.a.
31 | Penicillium citrinum isolate 25R-3-F01 | KX958075.1 | 100 | 100 | Soil-T6 (4) | Fungi | n.p. | n.p. | + | -
41 | Dicyma pulvinata | KU683763.1 | 91 | 96 | Soil-T6 (4) | Fungi | n.p. | n.p. | + | +
AT1 | Micrococcus yunnanensis strain YIM 65004 | NR_116578.1 | 100 | 99 | Soil-T5 (6) | Actinobacteria | - | - | + | +
AT2 | Micrococcus yunnanensis strain YIM 65004 | NR_116578.1 | 100 | 99 | Soil-T5 (6) | Actinobacteria | - | - | - | -
AT4 | Dermacoccus nishinomiyaensis strain DSM 20448 | NR_044872.1 | 100 | 99 | Soil-T5 (6) | Actinobacteria | - | - | n.a. | n.a.
AT5 | Dermacoccus nishinomiyaensis strain DSM 20448 | NR_044872.1 | 100 | 99 | Soil-T5 (6) | Actinobacteria | - | - | n.a. | n.a.
AT6(2) | Kocuria rhizophila strain TA68 | NR_026452.1 | 100 | 99 | Soil-T5 (6) | Actinobacteria | + | - | - | -
AT6(3)-1 | Kocuria rhizophila strain TA68 | NR_026452.1 | 100 | 99 | Soil-T5 (6) | Actinobacteria | n.t. | n.t. | + | +
AT7 | Micrococcus yunnanensis strain YIM 65004 | NR_116578.1 | 100 | 99 | Soil-T5 (6) | Actinobacteria | - | - | - | -
AT9 | Micrococcus aloeverae strain AE-6 | NR_134088.1 | 100 | 99 | Soil-T5 (6) | Actinobacteria | + | - | - | -
AT10 | Bradyrhizobium embrapense strain SEMIA 6208 | NR_145861.1 | 100 | 99 | Soil-T5 (7) | Proteobacteria | + | - | n.a. | n.a.
AT11 | Bradyrhizobium embrapense strain SEMIA 6208 | NR_145861.1 | 100 | 99 | Soil-T5 (7) | Proteobacteria | - | - | n.a. | n.a.
AT12 | Bradyrhizobium embrapense strain SEMIA 6208 | NR_145861.1 | 100 | 99 | Soil-T5 (7) | Proteobacteria | - | - | n.a. | n.a.
AT13 | Bradyrhizobium embrapense strain SEMIA 6208 | NR_145861.1 | 100 | 99 | Soil-T5 (8) | Proteobacteria | - | - | n.a. | n.a.
AT14 | Kocuria rhizophila strain TA68 | NR_026452.1 | 100 | 99 | Soil-T5 (9) | Actinobacteria | n.t. | n.t. | + | -
AT16 | Bradyrhizobium embrapense strain SEMIA 6208 | NR_145861.1 | 100 | 100 | Soil-T5 (8) | Proteobacteria | + | - | n.a. | n.a.
AT19 | Micrococcus yunnanensis strain YIM 65004 | NR_116578.1 | 100 | 99 | Soil-T5 (10) | Actinobacteria | - | - | - | -
AT20 | Flexivirga alba strain ST13 | NR_113034.1 | 99 | 98 | Soil-T5 (11) | Actinobacteria | - | + | n.a. | n.a.
AT21 | Bradyrhizobium embrapense strain SEMIA 6208 | NR_145861.1 | 100 | 99 | Soil-T5 (8) | Proteobacteria | n.t. | n.t. | n.a. | n.a.
AT22 | Kocuria rhizophila strain TA68 | NR_026452.1 | 100 | 99 | Soil-T5 (10) | Actinobacteria | - | - | - | -
AT23 | Micrococcus yunnanensis strain YIM 65004 | NR_116578.1 | 100 | 99 | Soil-T5 (12) | Actinobacteria | n.t. | n.t. | - | -
FG1 | Penicillium decaturense | HM469399 | 100 | 99 | Soil-T5 (13) | Fungi | n.t. | n.t. | - | +
FP1 | Penicillium decaturense | HM469399 | 100 | 99 | Soil-T5 (14) | Fungi | n.t. | n.t. | n.a. | n.a.

n.a. – not assayed; n.t. – not tested. In antimicrobial assays, + was assigned when an inhibition halo was visible. In cytotoxic assays, + was assigned when a cell viability below 75% was observed.
(1) Isolation in MNPS medium, at 4 °C, not diluted.
(2) Isolation in MNPS medium, at 4 °C, dilution 10⁻².
(3) Isolation in MNPS medium, at 4 °C, dilution 10⁻¹.
(4) Isolation in MNPS medium, at 19 °C, dilution 10⁻¹.
(5) Isolation in MNPS medium, at 19 °C, not diluted.
(6) Isolation in AIA medium, at 28 °C, PT1, dilution 10⁻¹.
(7) Isolation in AIA medium, at 28 °C, PT2, substrate inoculated.
(8) Isolation in AIA medium, at 28 °C, PT2, dilution 10⁻¹.
(9) Isolation in AIA medium, at 28 °C, PT2, not diluted.
(10) Isolation in SCN medium, at 28 °C, PT2, substrate inoculated.
(11) Isolation in AIA medium, at 28 °C, PT1, dilution 10⁻².
(12) Isolation in Czapek medium, at 28 °C, PT2, dilution 10⁻².
(13) Isolation in SCN medium, at 4 °C, PT2, substrate inoculated.
(14) Isolation in AIA medium, at 4 °C, PT2, substrate inoculated.
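The scoring rules stated in the table notes can be expressed as a small helper. This is a minimal sketch, not the procedure used to build the table: the function and constant names are hypothetical, and the notes do not say which exposure time or cell line the 75% cut-off was applied to, so the function simply takes one viability value. The example values are the 48 h viabilities quoted in the text for sub-fraction 31B-4 (56% in SH-SY5Y, 84% in T47-D) and for fraction A (42% in SH-SY5Y).

```python
# Cut-off taken from the table notes: "+" if cell viability is below 75%.
CYTOTOXIC_VIABILITY_CUTOFF = 75.0


def antimicrobial_call(halo_visible: bool) -> str:
    """Return "+" when an inhibition halo was visible, "-" otherwise."""
    return "+" if halo_visible else "-"


def cytotoxic_call(viability_percent: float) -> str:
    """Return "+" when the cell viability is below the 75% cut-off."""
    return "+" if viability_percent < CYTOTOXIC_VIABILITY_CUTOFF else "-"


# Viabilities after 48 h of exposure, as quoted in the text.
examples = {
    "31B-4, SH-SY5Y": 56.0,
    "31B-4, T47-D": 84.0,
    "fraction A, SH-SY5Y": 42.0,
}

calls = {label: cytotoxic_call(value) for label, value in examples.items()}
for label, call in calls.items():
    print(f"{label}: {call}")
```

Applied this way, a sample is called per cell line and time point; how the per-assay calls were combined into the single cytotoxic column of the table is not specified, so that step is left out here.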


V-Discussion

1 - Biodiversity and Functional Profile of Endolithic and Soil Microbiomes from the McMurdo Dry Valleys

The hyper-arid McMurdo Dry Valleys desert, located in Victoria Land, comprises the largest ice-free area of continental Antarctica 118. Characterized by extreme environmental constraints, such as low water availability, below-zero temperatures, strong winds and high UV exposure 119, it is considered one of the most inhospitable habitats, where colonization is restricted to microorganisms 118. Oligotrophic mineral soils and exposed rocky surfaces compose the terrestrial scenery 120. Here, the microbial community composition of McMurdo Dry Valleys soil and rock niches (sandstone with endolithic colonization) was assessed through 454 pyrosequencing of the 16S rRNA gene. In accordance with previous reports 118,121, the habitat type (endolithic vs soil) clearly influenced the bacterial community composition. Although the most abundant bacterial phyla were present in all samples, the taxonomic and phylogenetic composition differed clearly between the niches under study and also varied with environmental factors, namely water availability, which followed a gradient along the studied transect. Also consistent with previous studies 121, diversity indices showed the soil samples to be more diverse than the endolithic sample, according to the number of observed OTUs and the phylogenetic diversity (PD). Cyanobacteria are usually the dominant phylum in lithic-associated communities 122. Here, Cyanobacteria from the Acaryochloris genus made up 44% of the total endolithic sample. Interestingly, this genus was found exclusively in this sample. Cyanobacteria closely related to Acaryochloris marina have already been associated with endolithic communities in Antarctic granite rocks 123. These niches provide a barrier to penetration of organisms as well as a microclimate distinct from the exterior of the rock, with higher moisture levels 124.
Indeed, water availability seemed to define a threshold for colonization by some bacterial phyla, mainly Cyanobacteria, along the soil transect. The proportion of Cyanobacteria decreases from about 20% of the total sample to negligible values in the last three samples of the transect (T4-T6). In the initial samples of the transect, the Oscillatoriales genera Leptolyngbya and Phormidium were well represented. Leptolyngbya is usually associated with lake and maritime Antarctic communities 125,126, and Phormidium with water-saturated soils and river beds 127. These findings suggest, as already shown 121,128, that water availability/distance to aquatic ecosystems shapes the taxonomic composition of the bacterial communities. Conversely, Actinobacteria abundance increases in the driest samples. The desiccation-tolerant genus Rubrobacter 118 is also present in the endolithic sample. Interestingly, in contrast to previous studies 121,129,130, Proteobacteria and Bacteroidetes were highly represented in the tested samples. These phyla generally depend on a high soil organic content, which is not the case for most oligotrophic Antarctic soils 131. As expected, Acidobacteria, considered an oligotrophic phylum 129, were well represented and spread across the different samples. Remarkably, the endolithic sample was the only one among those studied to harbour bacteria from the Streptomycetaceae family (0.7%). The Streptomyces genus is known as a prolific source of secondary metabolites with a wide range of activities, responsible for 80% of all antibiotics of actinobacterial origin 78. Antibacterial compounds isolated from Antarctic Streptomyces species 87,88 have already been reported. Also, the Nocardiaceae family, present in most samples, includes species known to produce antimicrobial (e.g. abyssomicin 132) and antitumor (e.g. asterobactin 133) compounds.
Pathways involved in secondary metabolism were predicted from the 16S rRNA gene sequence diversity using PICRUSt 104. No marked differences were detected between the samples. Pathways involved in polyketide and nonribosomal peptide biosynthesis and in the metabolism of different antibiotic families – ansamycins 134, vancomycin 135, novobiocin 136, tetracycline 137, penicillin 1 and streptomycin 138 – were detected in virtually all samples. These antibiotics are produced by Actinobacteria from the Streptomyces and Amycolatopsis genera, except penicillin, which is produced by Penicillium fungi, and are of PK or NRP nature, except streptomycin, an aminoglycoside antibiotic. Interestingly, the pathway for streptomycin biosynthesis is overrepresented compared with the remaining secondary metabolite biosynthetic pathways, even though members of the Streptomycetaceae family were detected only in the endolithic sample. The antibiotic resistance (beta-lactam resistance) observed in the endolithic and T1 samples is indicative of competitive interactions between the taxa, as already described for Antarctic soils 120. In general, transect samples T6, T1 and T5 were the most diverse, and the last samples (T4-T6) were mainly composed of Actinobacteria. Samples T5 and T6 were therefore selected for the next steps.


2 - Culture-dependent Isolation of Actinobacteria from McMurdo Dry Valleys

Culture-based studies of the McMurdo Dry Valleys initially proposed a dominance of a small number of aerobic groups, with few anaerobic isolates 139. However, recent molecular-based phylogenetic studies have revealed the microbial diversity of Antarctic Dry Valley soils to be remarkably high 130. At least 14 different bacterial phyla have been described from Dry Valleys bacterial communities, which are dominated by Acidobacteria, Actinobacteria and Bacteroidetes 131. Still, culture-based studies have, in general, retrieved only some specific genera of the Actinobacteria phylum, such as Arthrobacter, Brevibacterium, Corynebacterium, Micrococcus, Nocardia and Streptomyces 140. It is now well known that the cultivable fraction of the microbial richness is typically below 1% 45. Different strategies to improve the culturability of microorganisms have started to be used, including pretreatment strategies and oligotrophic media 81. These have provided fruitful results, in particular in Antarctica 141. In fact, previous studies had found Antarctic edaphic bacteria to be resistant to cultivation, but recently Pulschen 141 has shown that it is possible to grow recalcitrant bacteria from Antarctic soils by using longer incubation periods, lower temperatures and oligotrophic media. In the present study, different culture isolation methods, including the mimicking of oligotrophic conditions and pretreatments, were employed, directed at the isolation of Actinobacteria strains from Dry Valleys soil samples. Attempts to isolate Actinobacteria from sample T6 using an oligotrophic medium were unsuccessful. Of the 15 identified microbial isolates, 13 belonged to the Paenisporosarcina genus and 2 were Fungi belonging to the Penicillium and Dicyma genera. According to the pyrosequencing taxonomic composition data, Nocardioidaceae, Sporichthyaceae, Rubrobacteraceae, Gaiellaceae and Patulibacteraceae are the predominant actinobacterial families in sample T6.
Members of the family Nocardioidaceae have previously been isolated from Antarctica, and particularly from the Dry Valleys 142. Members of the genus Rubrobacter have been detected in clone libraries from Victoria Land 143 and Ross Island soils 144, but had not been cultured from Antarctic soils until very recently, when Pulschen et al. (2017) 141 described two isolates sharing 99 and 95% identity with sequences of Rubrobacteridae bacterium Gsoil 319 and Gsoil 1167.

All of the bacterial isolates obtained belong to Paenisporosarcina, a genus whose members have previously been isolated from cold environments 145, including the Dry Valleys 146. The genus Paenisporosarcina was recently established 147, which resulted in the reclassification of several Sporosarcina species into the novel genus, including Sporosarcina macmurdoensis strain CMS 21w and Sporosarcina antarctica 145. The majority of the isolates obtained group in the phylogenetic tree with the reclassified Paenisporosarcina macmurdoensis strain CMS 21w, a psychrophilic bacterium isolated from a pond in Wright Valley, McMurdo Dry Valleys, Antarctica 146, and with Paenisporosarcina indica strain PN2, which was isolated from a soil sample collected close to the Pindari Glacier, in the Himalayan region 145. The phylogenetic analysis indicates that a new species of this genus might be present among the isolated strains. Interestingly, according to the NGS analysis, the relative abundance of the genus Paenisporosarcina in sample T6 is 0%. However, according to the alpha diversity analysis, the sequencing effort might not have fully covered the bacterial diversity.

Two fungal strains were isolated, belonging to the genera Penicillium and Dicyma. The genus Penicillium, initially described by Link in 1809 148, comprises almost 300 species 149 and is widespread around the globe in diverse habitats, including Antarctica, where P. antarcticum 150 and P. tardochrysogenum 151 were isolated in the Windmill Islands and the Dry Valleys, respectively. For the genus Dicyma, which harbours species used as biological control agents 152, this is the first report of isolation from the Antarctic continent.

For the isolation of Actinobacteria from sample T5, two different pretreatment strategies, heat shock and incubation with antibiotics, were used. In sample T5, the dominant family was clearly Sporichthyaceae, followed by Patulibacteraceae, Nocardioidaceae and Rubrobacteraceae. There are no records of the cultivation of isolates of the family Sporichthyaceae in Antarctica. The 13 actinobacterial strains identified all belong to the order Micrococcales, and are distributed across two families (Micrococcaceae and Dermacoccaceae) and four genera. Curiously, according to the pyrosequencing data, Micrococcaceae were not found in sample T5, while Dermacoccaceae were not detected in any sample. Bacterial species of the family Micrococcaceae are commonly retrieved in Antarctic culture-based studies 131,140, including species isolated in Antarctica 153. The isolates obtained were assigned to two genera of this family, Micrococcus and Kocuria. The genus Micrococcus was first described in 1872 by Cohn 154. The genus Kocuria resulted from the phylogenetic and chemotaxonomic division of the genus Micrococcus 155, and both genera include species isolated in Antarctica 153,156. Species of the genus Dermacoccus have also been isolated previously 157. For the remaining genus, Flexivirga, there is no report of previous isolation in Antarctica. The isolate AT20 groups with F. alba, a species isolated from soil near wastewater treatment facilities 158. According to the phylogenetic analysis and the 16S rRNA identity (98%), this isolate might represent a new species of the genus Flexivirga. Proteobacteria isolates belonging to the genus Bradyrhizobium were also obtained.
This genus, recognized as a slow-growing rhizobium and initially described by Jordan in 1982 159, is composed of bacteria able to establish N2-fixing symbioses with leguminous species and has already been reported in cold ecosystems, including Antarctica 160. B. embrapense strain SEMIA 6208 was originally isolated in Colombia from a nodule of Desmodium heterocarpon 161. Two Penicillium strains, identified as P. decaturense, were also obtained during the bacterial isolation process using sample T5. This species was originally isolated from wood-decay fungi in North America 162.

It is important to note that, although the samples were preserved at -80 ºC in life-guard solution, some diversity may not have been recovered because of a decrease in viability associated with long-term storage in this solution. In addition, many variables can dictate the ability of bacteria to grow, from the composition of the culture media 81 to more complex requirements, such as the presence of specific growth signals 163 or dependency on other microorganisms 164.
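The remark in this section that the sequencing effort may not have fully covered the bacterial diversity of sample T6 can be made concrete with two standard completeness measures: the bias-corrected Chao1 richness estimator 102 and Good's coverage. The block below is a minimal sketch in Python, not the pipeline used in this work. The function name and the OTU read counts are hypothetical and serve only to show how singletons and doubletons drive both estimates.

```python
from collections import Counter


def coverage_summary(otu_counts):
    """Summarize sampling completeness from per-OTU read counts.

    Returns the observed richness, the bias-corrected Chao1 estimate
    and Good's coverage. Zero counts are ignored.
    """
    counts = [c for c in otu_counts if c > 0]
    n_reads = sum(counts)
    s_obs = len(counts)

    freq = Counter(counts)
    f1 = freq[1]  # singletons: OTUs seen exactly once
    f2 = freq[2]  # doubletons: OTUs seen exactly twice

    # Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))
    chao1 = s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

    # Good's coverage: estimated fraction of reads from OTUs already sampled
    goods = 1 - f1 / n_reads if n_reads else 0.0

    return {"observed": s_obs, "chao1": chao1, "goods_coverage": goods}


# Hypothetical read counts per OTU (illustration only, not data from this study)
example_counts = [120, 60, 30, 8, 2, 2, 1, 1, 1, 1]
summary = coverage_summary(example_counts)
print(summary)
```

When the Chao1 estimate is clearly above the observed richness, or Good's coverage is well below 1, rare taxa are likely to be missing from the sequence data, which is one way a cultured genus can show 0% relative abundance in the amplicon profile.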

2 - Bioactive potential from McMurdo Dry Valleys Microbial Isolates

Microorganisms from extreme environments, including polar environments 165, are recognized as a rich source of novel bioactive molecules 54. However, only a few efforts have been directed at screening for their bioactive potential. Here, we report the bioactivity screening of bacterial and fungal isolates obtained from Dry Valleys soils. Initially, a PCR screening for the presence of PKS and NRPS genes was performed to evaluate the biosynthetic potential of the isolates, as is commonly done 33,166. Although the PCR screening indicated promising potential for Paenisporosarcina strains to produce bioactive compounds, attempts to grow these strains in liquid medium failed. No previous reports concerning the bioactive potential of this genus were found, but there are a small number of reports of antibacterial activity for Sporosarcina 167,168.

In total, 13 strains were tested in the bioassay screening. The organic extract from Micrococcus sp. strain AT1 slightly inhibited the growth of C. albicans and reduced the viability of the SH-SY5Y cell line. The best BLASTn match in NCBI for strain AT1, with 99% identity, is Micrococcus yunnanensis strain YIM 65004. This species has been reported to produce kocurin 166, a new member of the thiazolyl peptide family of antibiotics, and to possess type II PKS genes. Recently, a novel broad-activity antibacterial compound isolated from another strain has been reported 169. The organic extracts from Kocuria rhizophila strains AT6(3)-1 and AT14 proved to be active in both assays and in the antibacterial assay, respectively. Kocuria species have also been reported to produce the antibiotic kocurin 166, while no cytotoxic activity has been described. However, rare endophytic Actinobacteria (as are the two genera mentioned) are known to produce cytotoxic compounds 170. Organic extracts from the fungi proved to be notably more active than the bacterial ones in both assays.
Extracts from Dicyma pulvinata strain 41 were active against C. albicans and able to reduce the cell viability of the two cell lines tested. To our knowledge, this is the first report of bioactivity for this species, and no reports were found for the three species that make up this genus, apart from a neuroactive ergot alkaloid 171 that has been isolated from Dicyma sp. Both Penicillium species proved to be active. A recent study by Nielsen et al. (2017) 115 revealed an unexploited potential for secondary metabolite production in this genus. The organic extract from Penicillium decaturense strain FG1 reduced the cell viability of both cell lines tested. P. decaturense has been described as a source of insecticidal alkaloids 172 and, recently, as a source of cytotoxic, antibacterial and antifungal compounds 173. Fractions and subfractions from P. citrinum exhibited consistent antibacterial activity, mainly against B. subtilis and S. aureus. This species is known to produce several alkaloids (such as citridin A 174 and perinadine A 175) and polyketides 176,177. Furthermore, this species is a source of citrinin 178, a mycotoxin with antibiotic, antifungal and anticarcinogenic properties, among others. Citriquinone A 179 has been reported to inhibit the growth of Bacillus sp., which is in line with the results obtained. Nevertheless, this is the first report of bioactivity screening of a strain isolated from the Dry Valleys, in Antarctica, and thus the potential for the discovery of novel bioactive molecules cannot be discarded. Although Dermacoccus strains were not screened here, Dermacoccus abyssi strain MT1 has been reported to produce dermacozines 180, which have antitumour, antiprotozoal and free radical scavenging activities.

One of the barriers that we encountered was the inability to grow the microorganisms on a large scale, as most of them exhibit slow growth and do not grow at all at temperatures above 20 ºC.
Furthermore, when screening is performed at the organic extract level, a large part of the bioactivity might go undetected, as demonstrated with P. citrinum strain 31: while no significant activity was detected in the crude extract, the fractions and subfractions exhibited antibacterial activity. It was also observed that different methods of organic extraction can influence the results, as verified with the organic extracts from Dicyma pulvinata strain 41.

Overall, in this chapter, Antarctic microbiomes were shown to be a potential source of prolific phyla, given the distribution of Actinobacteria and Cyanobacteria. The isolation of microorganisms from this environment proved to be challenging, and future work will include further isolation and culture growth optimizations.

In addition, and despite the power of pyrosequencing technologies, culture-based methods were able to retrieve some rare taxa that were not detected in any of the pyrosequencing data, something that our group has encountered before 181. Future work will include the isolation of the compounds responsible for the bioactivities and further assays on the promising strains, such as Dicyma pulvinata strain 41. Furthermore, it will be interesting to characterize the bioactive potential of Flexivirga sp. strain AT20, a potential new species of the rare genus Flexivirga, which has not previously been assayed. These strains appear to contain biosynthetic domains of NRPS genes, according to our survey.
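The suggestion that strain AT20 may represent a new Flexivirga species rests partly on its 16S rRNA identity (98%) to the closest described species. The block below is a minimal Python sketch of how such a percentage can be computed from two already aligned sequences. It assumes one particular convention, matches divided by the number of columns in which neither sequence has a gap; other tools may count gaps differently. The function name and the short sequences are hypothetical and are not data from this study.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences.

    Columns where either sequence has a gap are skipped (an assumed
    convention; alignment tools differ on how gaps are counted).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment (real 16S rRNA genes are about 1,500 bp long)
isolate = "ACGTACGTAC-GT"
reference = "ACGTTCGTACAGT"
identity = percent_identity(isolate, reference)
print(f"{identity:.1f}% identity")
```

Whichever convention is used, an identity value is only comparable with published species-level cut-offs when it is computed over the same gene region and with the same treatment of gaps.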


VI - General Conclusion

With this work, we have developed new degenerate primer pairs that allow the amplification of biosynthetic domains (KS and AD) from a wider range of bacterial phyla than the currently published primers. If our predictions are confirmed, our primer pairs will constitute an important addition to the toolbox for metagenomic PCR-based PKS and NRPS gene mining. Retrieving more diversity and information from the environment can lead to the identification of hotspots of biosynthetic diversity and allow the selection of promising environments for the isolation of novel molecules. Nevertheless, our primers have already proved useful for surveying biosynthetic domains in bacterial isolates from at least Actinobacteria, Cyanobacteria, Planctomycetes and Proteobacteria. We have also provided a glimpse into the biodiversity and bioactive potential of the McMurdo Dry Valleys in Antarctica, by revealing an abundance of prolific phyla such as Actinobacteria and Cyanobacteria. The bioactivity screening revealed fungal strains with apparently stronger bioactivities than the bacterial isolates, in particular Dicyma pulvinata strain 41. To the best of our knowledge, this is the first report of bioactive properties for this species. Fractions and subfractions from P. citrinum strain 31 were shown to inhibit the growth of Gram-positive bacteria, in particular B. subtilis. Future work will include the isolation of the compounds responsible for the bioactivities and further assays on the promising strains, including the potential novel species of the genus Flexivirga.
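Because the primers summarized above are degenerate, each one stands for a pool of concrete oligonucleotides, and the size of that pool (the degeneracy) is a practical design constraint. The block below is a minimal Python sketch that expands IUPAC ambiguity codes and counts the degeneracy of a primer. The primer sequence and the function names are hypothetical illustrations and are not the primers developed in this work.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


# Hypothetical primer, for illustration only
primer = "ATGGAYCCNCAR"
variants = expand(primer)
print(degeneracy(primer), variants[:3])
```

Full expansion grows multiplicatively with each ambiguous position, so for highly degenerate primers it is usually enough to call `degeneracy` alone rather than enumerate every variant.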


VII - References

1. Fleming, A. On the antibacterial action of cultures of a penicillium, with special reference to their use in the isolation of B. influenzae. Bull. World Health Organ. 79, 226–236 (1929).
2. Cragg, G. M. & Newman, D. J. Natural Products: a continuing source of novel drug leads. Biochim. Biophys. Acta 1830, 3670–3695 (2013).
3. Wang, H., Fewer, D. P., Holm, L., Rouhiainen, L. & Sivonen, K. Atlas of nonribosomal peptide and polyketide biosynthetic pathways reveals common occurrence of nonmodular enzymes. Proc. Natl. Acad. Sci. U. S. A. 111, 9259–9264 (2014).
4. Fisch, K. M. & Schaberle, T. F. Toolbox for Antibiotics Discovery from Microorganisms. Arch. Pharm. (Weinheim). 683–691 (2016). doi:10.1002/ardp.201600064
5. Fischbach, M. A. & Walsh, C. T. Assembly-line enzymology for polyketide and nonribosomal peptide antibiotics: Logic, machinery, and mechanisms. Chem. Rev. 106, 3468–3496 (2006).
6. Ding, X. et al. Bacterial biosynthesis and maturation of the didemnin anticancer agents. J. Am. Chem. Soc. 1320, 8625–8632 (2013).
7. Medema, M. H. & Fischbach, M. A. Computational approaches to natural product discovery. Nat. Chem. Biol. 11, 639–648 (2015).
8. Weber, G., Schörgendorfer, K., Schneider-Scherzer, E. & Leitner, E. The peptide synthetase catalyzing cyclosporine production in Tolypocladium niveum is encoded by a giant 45.8-kilobase open reading frame. Curr. Genet. 26, 120–125 (1994).
9. Wang, H., Sivonen, K. & Fewer, D. P. Genomic insights into the distribution, genetic diversity and evolution of polyketide synthases and nonribosomal peptide synthetases. Curr. Opin. Genet. Dev. 35, 79–85 (2015).
10. Medema, M. H. The Minimum Information about a Biosynthetic Gene cluster. Nat. Chem. Biol. 11, 625–631 (2015).
11. Ziemert, N. et al. The natural product domain seeker NaPDoS: A phylogeny based bioinformatic tool to classify secondary metabolite gene diversity. PLoS One 7, 1–9 (2012).
12. Pang, B., Wang, M. & Liu, W. Cyclization of polyketides and non-ribosomal peptides on and off their assembly lines. Nat. Prod. Rep. 00, 1–12 (2016).
13. Jenke-Kodama, H., Sandmann, A., Müller, R. & Dittmann, E. Evolutionary implications of bacterial polyketide synthases. Mol. Biol. Evol. 22, 2027–2039 (2005).
14. Helfrich, E., Reiter, S. & Piel, J. Recent advances in genome-based polyketide discovery. Curr. Opin. Biotechnol. 29, 107–115 (2014).
15. Shen, B. Polyketide biosynthesis beyond the type I, II and III polyketide synthase paradigms. Curr. Opin. Chem. Biol. 7, 285–295 (2003).
16. Bode, H. B. & Müller, R. The impact of bacterial genomics on natural product research. Angew. Chemie - Int. Ed. 44, 6828–6846 (2005).
17. Cheng, Y.-Q., Tang, G.-L. & Shen, B. Type I polyketide synthase requiring a discrete acyltransferase for polyketide biosynthesis. Proc. Natl. Acad. Sci. U. S. A. 100, 3149–3154 (2003).
18. Leclère, V., Weber, T., Jacques, P. & Pupin, M. in Nonribosomal Peptide and Polyketide Biosynthesis 209–232 (2016). doi:10.1007/978-1-4939-3375-4_1
19. Dejong, C. A. et al. Polyketide and nonribosomal peptide retro-biosynthesis and global gene cluster matching. Nat. Chem. Biol. 12, 1007–1014 (2016).
20. Maclean, D., Jones, J. D. G. & Studholme, D. J. Application of ‘next-generation’ sequencing technologies to microbial genetics. Nat. Rev. Microbiol. 7, 96–97 (2009).
21. Walsh, C. T. & Fischbach, M. A. Natural products version 2.0: Connecting genes to molecules. J. Am. Chem. Soc. 132, 2469–2493 (2010).
22. Bachmann, B. O. & Ravel, J. in Methods in Enzymology, V 458, 181–217 (Elsevier Inc., 2009).
23. Adamek, M., Spohn, M., Stegmann, E. & Ziemert, N. in Antibiotics: Methods and Protocols 1520, 291–306 (2017).
24. Skinnider, M. A. et al. Genomes to natural products PRediction Informatics for Secondary Metabolomes (PRISM). Nucleic Acids Res. 43, 9645–9662 (2015).
25. Zarins-Tutt, J. S. et al. Prospecting for new bacterial metabolites: a glossary of approaches for inducing, activating and upregulating the biosynthesis of bacterial cryptic or silent natural products. Nat. Prod. Rep. 33, 54–72 (2016).
26. Zerikly, M. & Challis, G. L. Strategies for the discovery of new natural products by genome mining. ChemBioChem 10, 625–633 (2009).
27. Bachmann, B. O., Van Lanen, S. G. & Baltz, R. H. Microbial genome mining for accelerated natural products discovery: Is a renaissance in the making? J. Ind. Microbiol. Biotechnol. 41, 175–184 (2014).
28. Aleti, G., Sessitsch, A. & Brader, G. Genome mining: Prediction of lipopeptides and polyketides from Bacillus and related Firmicutes. Comput. Struct. Biotechnol. J. 13, 192–203 (2015).
29. Brady, S. F. et al. Pantocin B, an antibiotic from Erwinia herbicola discovered by heterologous expression of cloned genes. J. Am. Chem. Soc. 121, 11912–11913 (1999).
30. Konstantinidis, K. T. & Tiedje, J. M. Trends between gene content and genome size in prokaryotic species with larger genomes. Proc. Natl. Acad. Sci. U. S. A. 101, 3160–3165 (2004).
31. Boddy, C. N. Bioinformatics tools for genome mining of polyketide and non-ribosomal peptides. J. Ind. Microbiol. Biotechnol. 41, 443–450 (2014).
32. Jumpathong, J., Peberdy, J., Fujii, I. & Lumyong, S. Chemical investigation of novel ascomycetes using PCR based screening approaches. World J. Microbiol. Biotechnol. 27, 1947–1953 (2011).
33. Ayuso-Sacido, A. & Genilloud, O. New PCR primers for the screening of NRPS and PKS-I systems in actinomycetes: Detection and distribution of these biosynthetic gene sequences in major taxonomic groups. Microb. Ecol. 49, 10–24 (2005).
34. Moffitt, M. C. & Neilan, B. A. Evolutionary affiliations within the superfamily of ketosynthases reflect complex pathway associations. J. Mol. Evol. 56, 446–457 (2003).
35. Neilan, B. A. et al. Nonribosomal peptide synthesis and toxigenicity of cyanobacteria. J. Bacteriol. 181, 4089–4097 (1999).
36. Metsa-Ketela, M. et al. An efficient approach for screening minimal PKS genes from Streptomyces. FEMS Microbiol. Lett. 180, 1–6 (1999).
37. Caboche, S. et al. NORINE: A database of nonribosomal peptides. Nucleic Acids Res. 36, 326–331 (2008).
38. Weber, T. et al. antiSMASH 3.0-a comprehensive resource for the genome mining of biosynthetic gene clusters. Nucleic Acids Res. 43, W237–43 (2015).
39. Röttig, M. et al. NRPSpredictor2 - A web server for predicting NRPS adenylation domain specificity. Nucleic Acids Res. 39, 1–6 (2011).
40. Reddy, B., Milshteyn, A., Charlop-Powers, Z. & Brady, S. ESNaPD: A Versatile, Web-Based Bioinformatics Platform for Surveying and Mining Natural Product Biosynthetic Diversity from Metagenomes. Chem. Biol. 21, 1023–1033 (2014).
41. Khaldi, N. et al. SMURF: genomic mapping of fungal secondary metabolite clusters. Fungal Genet. Biol. 47, 736–741 (2010).
42. Rossello-Mora, R. & Amann, R. The species concept for prokaryotes. FEMS Microbiol. Ecol. 25, 39–67 (2001).
43. Nováková, J. & Farkašovský, M. Bioprospecting microbial metagenome for natural products. Biologia (Bratisl). 68, (2013).
44. Amann, R. I., Ludwig, W. & Schleifer, K. H. Phylogenetic identification and in situ detection of individual microbial cells without cultivation. Microbiol. Rev. 59, 143–169 (1995).
45. Epstein, S. S. The phenomenon of microbial uncultivability. Curr. Opin. Microbiol. 16, 636–642 (2013).
46. Goll, J. B., Szpakowski, S., Krampis, K. & Nelson, K. E. in Bioinformatics and Data Analysis in 259 (2014). doi:7
47. Tambadou, F. et al. Novel nonribosomal peptide synthetase (NRPS) genes sequenced from intertidal mudflat bacteria. FEMS Microbiol. Lett. 357, 123–130 (2014).
48. Woodhouse, J. N., Fan, L., Brown, M. V., Thomas, T. & Neilan, B. A. Deep sequencing of non-ribosomal peptide synthetases and polyketide synthases from the microbiomes of Australian marine sponges. ISME J. 7, 1842–51 (2013).
49. Brady, S. F., Simmons, L., Kim, J. H. & Schmidt, E. W. Metagenomic Approaches to Natural Products from Free-Living and Symbiotic Organisms. Nat. Prod. Rep. 26, 1488–1503 (2009).
50. Charlop-Powers, Z., Owen, J. G., Reddy, B. V. B., Ternei, M. A. & Brady, S. F. Chemical-biogeographic survey of secondary metabolism in soil. Proc. Natl. Acad. Sci. U. S. A. 111, 3757–62 (2014).
51. Reddy, B. V. B. et al. Natural product biosynthetic gene diversity in geographically distinct soil microbiomes. Appl. Environ. Microbiol. 78, 3744–3752 (2012).
52. Charlop-Powers, Z. et al. Urban park soil microbiomes are a rich reservoir of natural product biosynthetic diversity. PNAS 1–6 (2016). doi:10.1073/pnas.1615581113
53. Amos, G. C. A. et al. Designing and implementing an assay for the detection of rare and divergent NRPS and PKS clones in European, Antarctic and Cuban soils. PLoS One 10, 1–15 (2015).
54. Wilson, Z. E. & Brimble, M. A. Molecules derived from the extremes of life. Nat. Prod. Rep. 26, 44–71 (2009).
55. Zhao, J., Yang, N. & Zeng, R. Phylogenetic analysis of type I polyketide synthase and nonribosomal peptide synthetase genes in Antarctic sediment. Extremophiles 12, 97–105 (2008).
56. Schirmer, A. et al. Metagenomic Analysis Reveals Diverse Polyketide Synthase Gene Clusters in Microorganisms Associated with the Marine Sponge Discodermia dissoluta. Appl. Environ. Microbiol. 71, 4840–4849 (2005).
57. Hug, L. A. et al. A new view of the tree and life’s diversity. Nat. Microbiol. 1, 1–6 (2016).
58. Radjasa, O. K. et al. Antagonistic Activity of a Marine Bacterium Pseudoalteromonas luteoviolacea TAB4.2 Associated with Coral Acropora sp. J. Biol. Sci. 7, 239–246 (2007).
59. Schirmer, A. et al. Metagenomic Analysis Reveals Diverse Polyketide Synthase Gene Clusters in Microorganisms Associated with the Marine Sponge Discodermia dissoluta. Appl. Environ. Microbiol. 71, 4840–4849 (2005).
60. Song, J., Dong, X., Jiao, B.-H. & Wang, L.-H. Directly accessing the diversity of bacterial type I polyketide synthase gene in Chinese soil and seawater. African J. Microbiol. Res. 7, 4065–4072 (2013).
61. Wawrik, B., Kerkhof, L., Zylstra, G. J., Jerome, J. & Kukor, J. J. Identification of Unique Type II Polyketide Synthase Genes in Soil. Appl. Environ. Microbiol. 71, 2232–2238 (2005).
62. Seow, K. et al. A Study of Iterative Type II Polyketide Synthases, Using Bacterial Genes Cloned from Soil DNA: a Means To Access and Use Genes from Uncultured Microorganisms. J. Bacteriol. 179, 7360–7368 (1997).
63. Ansari, M. Z., Yadav, G., Gokhale, R. S. & Mohanty, D. NRPS-PKS: A knowledge-based resource for analysis of NRPS-PKS megasynthases. Nucleic Acids Res. 32, 405–413 (2004).
64. Flissi, A. et al. Norine, the knowledgebase dedicated to non-ribosomal peptides, is now open to crowdsourcing. Nucleic Acids Res. 44, D1113–D1118 (2016).
65. Kumar, S., Stecher, G. & Tamura, K. MEGA7: Molecular Evolutionary Genetics Analysis version 7.0 for bigger datasets. Mol. Biol. Evol. 1–11 (2016).
66. Kearse, M. et al. Geneious Basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28, 1647–1649 (2012).
67. Kibbe, W. A. OligoCalc: An online oligonucleotide properties calculator. Nucleic Acids Res. 35, 43–46 (2007).
68. Bikandi, J., Millán, R. S., Rementeria, A. & Garaizar, J. In silico analysis of complete bacterial genomes: PCR, AFLP-PCR and endonuclease restriction. Bioinformatics 20, 798–799 (2004).
69. San Millán, R. M., Martínez-Ballesteros, I., Rementeria, A., Garaizar, J. & Bikandi, J. Online exercise for the design and simulation of PCR and PCR-RFLP experiments. BMC Res. Notes 6, 513 (2013).
70. Stothard, P. The Sequence Manipulation Suite: JavaScript programs for analyzing and formatting protein and DNA sequences. Biotechniques 28, 1102–1104 (2000).
71. Brito, Â. et al. Bioprospecting Portuguese Atlantic coast cyanobacteria for bioactive secondary metabolites reveals untapped chemodiversity. Algal Res. 9, 218–226 (2015).
72. Lage, O. M. & Bondoso, J. Planctomycetes diversity associated with macroalgae. FEMS Microbiol. Ecol. 78, 366–375 (2011).
73. Kotai, J. Instructions for Preparation of Modified Nutrient Solution Z8 for Algae. Nor. Inst. Water Res. (1972).
74. Korbie, D. J. & Mattick, J. S. Touchdown PCR for increased specificity and sensitivity in PCR amplification. Nat. Protoc. 3, 13–15 (2008).
75. Fouces, R., Mellado, E., Diez, B. & Barredo, J. L. The tylosin biosynthetic cluster from Streptomyces fradiae: genetic organization of the left region. Microbiology 145, 855–868 (1999).
76. Newman, D. J. & Cragg, G. M. Natural Products as Sources of New Drugs from 1981 to 2014. J. Nat. Prod. 79, 629–661 (2016).
77. Núñez-Pons, L. & Avila, C. Natural products mediating ecological interactions in Antarctic benthic communities: a mini-review of the known molecules. Nat. Prod. Rep. 32, 1114–30 (2015).
78. Barka, E. A. et al. Taxonomy, Physiology, and Natural Products of Actinobacteria. Microbiol. Mol. Biol. Rev. 80, 1–43 (2016).
79. Burja, A. M., Banaigs, B., Abou-Mansour, E., Grant Burgess, J. & Wright, P. C. Marine cyanobacteria - a prolific source of natural products. Tetrahedron 57, 9347–9377 (2001).
80. Schüffler, A. & Anke, T. Fungal natural products in research and development. Nat. Prod. Rep. 31, 1425–1448 (2014).
81. Xiong, Z.-Q., Wang, J.-F., Hao, Y.-Y. & Wang, Y. Recent Advances in the Discovery and Development of Marine Microbial Natural Products. Mar. Drugs 11, 700–717 (2013).
82. Pye, C. R., Bertin, M. J., Lokey, R. S., Gerwick, W. H. & Linington, R. G. Retrospective analysis of natural products provides insights for future discovery trends. Proc. Natl. Acad. Sci. 114, 5601–5606 (2017).
83. Sarmiento-Vizcaíno, A. et al. Paulomycin G, a New Natural Product with Cytotoxic Activity against Tumor Cell Lines Produced by Deep-Sea Sediment Derived Micromonospora matsumotoense M-412 from the Avilés Canyon in the Cantabrian Sea. Mar. Drugs 15, 271 (2017).
84. Rojas, J. L. et al. Bacterial diversity from benthic mats of Antarctic lakes as a source of new bioactive metabolites. Mar. Genomics 2, 33–41 (2009).
85. Godinho, V. M. et al. Diversity and bioprospecting of fungal communities associated with endemic and cold-adapted macroalgae in Antarctica. ISME J. 7, 1434–1451 (2013).
86. Asthana, R. K. et al. Isolation and identification of a new antibacterial entity from the Antarctic cyanobacterium Nostoc CCC 537. J. Appl. Phycol. 21, 81–88 (2009).
87. Bruntner, C. et al. Frigocyclinone, a novel angucyclinone antibiotic produced by a Streptomyces griseus strain from Antarctica. J. Antibiot. (Tokyo). 58, 346–349 (2005).
88. Bringmann, G. et al. Gephyromycin, the first bridged angucyclinone, from Streptomyces griseus strain NTK 14. Phytochemistry 66, 1366–1373 (2005).
89. Ivanova, V. et al. Microbiaeratin, a new natural indole alkaloid from a Microbispora aerata strain, isolated from Livingston Island, Antarctica. Prep. Biochem. Biotechnol. 37, 161–8 (2007).
90. Li, Y. et al. Bioactive asterric acid derivatives from the Antarctic ascomycete Geomyces sp. J. Nat. Prod. 71, 1643–1646 (2008).
91. Li, L., Li, D., Luan, Y., Gu, Q. & Zhu, T. Cytotoxic metabolites from the Antarctic psychrophilic fungus Oidiodendron truncatum. J. Nat. Prod. 75, 920–927 (2012).
92. Wu, G. et al. Penilactones A and B, two novel polyketides from Antarctic deep-sea derived fungus Penicillium crustosum PRB-2. Tetrahedron 68, 9745–9749 (2012).
93. Walker, J. J. & Pace, N. R. Endolithic Microbial Ecosystems. Annu. Rev. Microbiol. 61, 331–347 (2007).
94. Adams, B. et al. Co-variation in soil biodiversity and biogeochemistry in northern and southern Victoria Land, Antarctica. Antarct. Sci. 18, 535–548 (2006).
95. Weisburg, W. G., Barns, S. M., Pelletier, D. A. & Lane, D. J. 16S Ribosomal DNA Amplification for Phylogenetic Study. J. Bacteriol. 173, 697–703 (1991).
96. Wang, Y. & Qian, P. Y. Conservative fragments in bacterial 16S rRNA genes and primer design for 16S ribosomal DNA amplicons in metagenomic studies. PLoS One 4, (2009).
97. Caporaso, J. G. et al. QIIME allows analysis of high-throughput community sequencing data. Nat. Publ. Gr. 7, 335–336 (2010).
98. Sneath, P. H. A. & Sokal, R. R. Numerical Taxonomy. The Principles and Practice of Numerical Classification. Systematic Zoology 24, (1973).
99. Edgar, R. C. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26, 2460–2461 (2010).
100. DeSantis, T. Z. et al. Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Appl. Environ. Microbiol. 72, 5069–5072 (2006).
101. Wang, Q., Garrity, G. M., Tiedje, J. M. & Cole, J. R. Naïve Bayesian Classifier for Rapid Assignment of rRNA Sequences into the New Bacterial Taxonomy. Appl. Environ. Microbiol. 73, 5261–5267 (2007).
102. Chao, A. Non-parametric estimation of the classes in a population. Scand. J. Stat. 11, 265–270 (1984).
103. Lozupone, C. & Knight, R. UniFrac: a New Phylogenetic Method for Comparing Microbial Communities. Appl. Environ. Microbiol. 71, 8228–8235 (2005).
104. Langille, M. G. I. et al. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nat. Biotechnol. 31, 814–821 (2013).
105. Hamaki, T. et al. Isolation of novel bacteria and actinomycetes using soil-extract agar medium. J. Biosci. Bioeng. 99, 485–492 (2005).
106. Shirling, E. B. & Gottlieb, D. Methods for characterization of Streptomyces species. Int. J. Syst. Bacteriol. 16, 313–340 (1966).
107. Hameş-Kocabaş, E. E. & Uzel, A. Isolation strategies of marine-derived actinomycetes from sponge and sediment samples. J. Microbiol. Methods 88, 342–347 (2012).
108. Axenov-Gribanov, D. et al. The isolation and characterization of actinobacteria from dominant benthic macroinvertebrates endemic to Lake Baikal. Folia Microbiol. (Praha). 61, 159–168 (2016).
109. Weisburg, W. G., Barns, S. M., Pelletier, D. A. & Lane, D. J. 16S ribosomal DNA amplification for phylogenetic study. J. Bacteriol. 173, 697–703 (1991).
110. Kumar, S., Stecher, G. & Tamura, K. MEGA7: Molecular Evolutionary Genetics Analysis version 7.0 for bigger datasets. Mol. Biol. Evol. 33, 1870–1874 (2016).
111. Tamura, K. & Nei, M. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 10, 512–26 (1993).
112. Muyzer, G., De Waal, E. C. & Uitterlinden, A. G. Profiling of complex microbial populations by denaturing gradient gel electrophoresis analysis of polymerase chain reaction-amplified genes coding for 16S rRNA. Appl. Environ. Microbiol. 59, 695–700 (1993).
113. White, T. J., Bruns, T., Lee, S. & Taylor, J. in PCR - Protocols and Applications - A Laboratory Manual 315–322 (1990).
114. O’Donnell, K. in Fusarium and its near relatives 225–233 (1993).
115. Nielsen, J. C. et al. Global analysis of biosynthetic gene clusters reveals vast potential of secondary metabolite production in Penicillium species. Nat. Microbiol. 2, 17044 (2017).
116. Lozupone, C. & Knight, R. Species Divergence and the Measurement of Microbial Diversity. FEMS Microbiol. Rev. 32, 557–578 (2008).
117. Menna, P., Barcellos, F. G. & Hungria, M. Phylogeny and taxonomy of a diverse collection of Bradyrhizobium strains based on multilocus sequence analysis of the 16S rRNA gene, ITS region and glnII, recA, atpD and dnaK genes. Int. J. Syst. Evol. Microbiol. 59, 2934–2950 (2009).
118. Pointing, S. B. et al. Highly specialized microbial diversity in hyper-arid polar desert. Proc. Natl. Acad. Sci. U. S. A. 106, 19964–19969 (2009).
119. De Los Ríos, A., Wierzchos, J. & Ascaso, C. The lithic microbial ecosystems of Antarctica’s McMurdo Dry Valleys. Antarct. Sci. 19, 1–19 (2014).
120. Chan, Y., Van Nostrand, J. D., Zhou, J., Pointing, S. B. & Farrell, R. L. Functional ecology of an Antarctic Dry Valley. Proc. Natl. Acad. Sci. U. S. A. 110, 8990–5 (2013).
121. Van Goethem, M. W., Makhalanyane, T. P., Valverde, A., Cary, S. C. & Cowan, D. A. Characterization of bacterial communities in lithobionts and soil niches from Victoria Valley, Antarctica. FEMS Microbiol. Ecol. 92, 1–22 (2016).
122. Chan, Y. et al. Hypolithic microbial communities: Between a rock and a hard place. Environ. Microbiol. 14, 2272–2282 (2012).
123. De Los Ríos, A., Grube, M., Sancho, L. G. & Ascaso, C. Ultrastructural and genetic characteristics of endolithic cyanobacterial biofilms colonizing Antarctic granite rocks. FEMS Microbiol. Ecol. 59, 386–395 (2007).
124. Friedmann, I. Endolithic Microbial Life in Hot and Cold Deserts. Orig. Life 10, 223–235 (1980).
125. Taton, A. et al. Biogeographical distribution and ecological ranges of benthic cyanobacteria in East Antarctic lakes. FEMS Microbiol. Ecol. 57, 272–289 (2006).
126. Zakhia, F., Jungblut, A., Taton, A., Vincent, W. F. & Wilmotte, A. Cyanobacteria in Cold Ecosystems. Psychrophiles from Biodivers. to Biotechnol. 121–135 (2008).
127. Vincent, W. F., Downes, M. T., Castenholz, R. W. & Howard-Williams, C. Community structure and pigment organisation of cyanobacteria-dominated microbial mats in Antarctica. Eur. J. Phycol. 28, 213–221 (1993).
128. Wood, S. A., Rueckert, A., Cowan, D. A. & Cary, S. C. Sources of edaphic cyanobacterial diversity in the Dry Valleys of Eastern Antarctica. ISME J. 2, 308–320 (2008).
129. Fierer, N. et al. Comparative metagenomic, phylogenetic and physiological analyses of soil microbial communities across nitrogen gradients. ISME J. 6, 1007–1017 (2012).
130. Smith, J. J., Tow, L. A., Stafford, W., Cary, C. & Cowan, D. A. Bacterial diversity in three different Antarctic cold desert mineral soils. Microb. Ecol. 51, 413–421 (2006).
131. Cary, S. C., McDonald, I. R., Barrett, J. E. & Cowan, D. A. On the rocks: the microbiology of Antarctic Dry Valley soils. Nat. Rev. Microbiol. 8, 129–138 (2010).
132. Gottardi, E. M. et al. Abyssomicin Biosynthesis: Formation of an Unusual Polyketide, Antibiotic-Feeding Studies and Genetic Analysis. ChemBioChem 12, 1401–1410 (2011).
133. Nemoto, A. et al. Asterobactin, a New Siderophore Group Antibiotic from Nocardia asteroides. J. Antibiot. (Tokyo). 55, 593–597 (2002).
134. Lu, C. & Shen, Y. A novel ansamycin, naphthomycin K from Streptomyces sp. J. Antibiot. (Tokyo). 60, 649–653 (2007).
135. Brigham, R. & Pittenger, R. Streptomyces orientalis, n. sp., the source of vancomycin. Antibiot. Chemother. 6, 642–647 (1956).
136. Kominek, L. A. Biosynthesis of novobiocin by Streptomyces niveus. Antimicrob. Agents Chemother. 1, 123–134 (1972).
137. Darken, M. A., Berenson, H., Shirk, R. J. & Sjolander, N. O. Production of tetracycline by Streptomyces aureofaciens in synthetic media. Appl. Microbiol. 8, 46–51 (1960).
138. Waksman, S. A., Reilly, C. H. & Johnstone, D. B. Isolation of streptomycin-producing strains of Streptomyces griseus. J. Bacteriol. 52, 393–397 (1946).
139. Friedmann, E. I. Endolithic microorganisms in the Antarctic cold desert. Science 215, 1045–1053 (1982).
140. Johnson, R. M., Madden, J. M. & Swafford, J. R. Taxonomy of Antarctic Bacteria from soils and air primarily of the McMurdo station and Victoria Land Dry Valleys Region. Antarct. Res. Ser. Terr. Biol. II 30, 35–64 (1972).
141. Pulschen, A. A. et al. Isolation of uncultured bacteria from Antarctica using long incubation periods and low nutritional media. Front. Microbiol. 8, 1–12 (2017).
142. Schumann, P., Prauser, H., Rainey, F. A., Stackebrandt, E. & Hirsch, P. Friedmanniella antarctica gen. nov., sp. nov., an LL-Diaminopimelic Acid-Containing Actinomycete from Antarctic Sandstone. Int. J. Syst. Bacteriol. 47, 278–283 (1997).
143. Aislabie, J. M. et al. Dominant bacteria in soils of Marble Point and Wright Valley, Victoria Land, Antarctica. Soil Biol. Biochem. 38, 3041–3056 (2006).
144. Saul, D. J., Aislabie, J. M., Brown, C. E., Harris, L. & Foght, J. M. Hydrocarbon contamination changes the bacterial diversity of soil from around Scott Base, Antarctica. 
FEMS Microbiol. Ecol. 53, 141–155 (2005). 145. Reddy, G. S. N., Poorna Manasa, B., Singh, S. K. & Shivaji, S. Paenisporosarcina indica sp. nov., a psychrophilic bacterium from a glacier, and reclassification of Sporosarcina antarctica Yu et al., 2008 as Paenisporosarcina antarctica comb. nov. and emended description of the genus Paenisporosarcina. Int. J. Syst. Evol. Microbiol. 63, 2927–2933 (2013). 146. Reddy, G. S. N., Matsumoto, G. I. & Shivaji, S. Sporosarcina macmurdoensis sp. nov., from a cyanobacterial mat sample from a pond in the McMurdo Dry Valleys, Antarctica. Int. J. Syst. Evol. Microbiol. 53, 1363–1367 (2003). 147. Krishnamurthi, S. et al. Description of Paenisporosarcina quisquiliarum gen. nov., sp. nov., and reclassification of Sporosarcina macmurdoensis Reddy et al. 2003 as Paenisporosarcina macmurdoensis comb. nov. Int. J. Syst. Evol. Microbiol. 59, 1364– 1370 (2009). 148. Link, H. F. Observationes in ordines plantarum naturales. Mag. der Gesellschaft Naturforschenden Freunde Berlin 31, 3–42 (1809). 149. Pitt, J. I., Samson, R. A. & Frisvad, J. C. in Integration of modern taxonomic methods for Penicillium and Aspergillus classification 9–79 (2000). 150. McRae, C. F., Seppelt, R. D. & Hocking, A. D. Penicillium species from terrestrial habitats in the Windmill Islands, East Antarctica, including a new species, Penicillium antarcticum. Polar Biol. 21, 97–111 (1999). 151. Houbraken, J. et al. New penicillin-producing Penicillium species and an overview of section Chrysogena. Persoonia Mol. Phylogeny Evol. Fungi 29, 78–100 (2012). 152. Mello, S. et al. Antagonistic process of Dicyma pulvinata against Fusicladium macrosporum on rubber tree. Fitopatol. Bras. 33, 5–11 (2008). 153. Liu, H., Xu, Y., Ma, Y. & Zhou, P. Characterization of Micrococcus antarcticus sp. nov., a FCUP 83 Exploring Polar Microbiomes as Source of Bioactive Molecules

psychrophilic bacterium from Antarctica. Int. J. Syst. Evol. Microbiol. 50, 715–719 (2000). 154. Cohn, F. Untersuchungen uber Bakterien. Beitr Biol Pflanz 1, 127–244 (1872). 155. Taxonomic Dissection of the Genus Micrococcus : Kocuria gen . 45, 46100 (1872). 156. Reddy, G. S. N. et al. Kocuria polaris sp. nov., an orange-pigmented psychrophilic bacterium isolated from an Antarctic cyanobacterial mat sample. Int. J. Syst. Evol. Microbiol. 53, 183–187 (2003). 157. Vasileva-Tonkova, E. et al. Ecophysiological properties of cultivable heterotrophic bacteria and yeasts dominating in phytocenoses of Galindez Island, maritime Antarctica. World J. Microbiol. Biotechnol. 30, 1387–1398 (2014). 158. Anzai, K. et al. Flexivirga alba gen. nov., sp. nov., an actinobacterial taxon in the family Dermacoccaceae. J.Antibiot.(Tokyo) 64, 613–616 (2011). 159. Jordan, D. C. Transfer of Rhizobium japonicum Buchananm 1980 to Bradyrhizobium japonicum gen. Nov., a genus of slow growing root nodule bateria. Int. J. Syst. Bacteriol. 378–380 (1982). 160. Lee, L. H. et al. Analysis of Antarctic Proteobacteria by PCR fingerprinting and screening for antimicrobial secondary metabolites. Genet. Mol. Res. 11, 1627–1641 (2012). 161. Delamuta, J. R. M. et al. Bradyrhizobium tropiciagri sp. nov. and bradyrhizobium embrapense sp. nov nitrogenfixing symbionts of tropical forage legumes. Int. J. Syst. Evol. Microbiol. 65, 4424–4433 (2015). 162. Peterson, S. W., Bayer, Eileen, M. & Wicklow, D. T. Penicillium thiersii, Penicillium angulare and Penicillium decaturense, new species isolate from wood-decay fungi in North America and their phylogenetic placememnt from moltilocus DNA sequences analysis. Mycologia 96, 1280–1293 (2006). 163. Lewis, K., Epstein, S., D’Onofrio, A. & Ling, L. L. Uncultured microorganisms as a source of secondary metabolites. J. Antibiot. (Tokyo). 63, 468–476 (2010). 164. Davis, K. E. R., Sangwan, P. & Janssen, P. H. 
Acidobacteria, Rubrobacteridae and Chloroflexi are abundant among very slow-growing and mini-colony-forming soil bacteria. Environ. Microbiol. 13, 798–805 (2011). 165. Tian, Y., Li, Y. L. & Zhao, F. C. Secondary metabolites from polar organisms. Mar. Drugs 15, (2017). 166. Palomo, S. et al. Sponge-derived Kocuria and Micrococcus spp. as sources of the new thiazolyl peptide antibiotic kocurin. Mar. Drugs 11, 1071–1086 (2013). 167. Matobole, R. M., Van Zyl, L. J., Parker-Nance, S., Davies-Coleman, M. T. & Trindade, M. Antibacterial Activities of Bacteria Isolated from the Marine Sponges Isodictya compressa and Higginsia bidentifera Collected from Algoa Bay, South Africa. Mar. Drugs 15, 8–10 (2017). 168. Graça, A. P. et al. Antimicrobial activity of heterotrophic bacterial communities from the marine sponge Erylus discophorus (Astrophorida, Geodiidae). PLoS One 8, (2013). 169. Ranjan, R. & Jadeja, V. Isolation, characterization and chromatography based purification of antibacterial compound isolated from rare endophytic actinomycetes Micrococcus yunnanensis. J. Pharm. Anal. 1–5 doi:10.1016/j.jpha.2017.05.001 170. Salam, N. et al. Endophytic Actinobacteria associated with Dracaena cochinchinensis Lour.: Isolation, diversity, and their cytotoxic activities. Biomed Res. Int. 2017, (2017). 171. Vázquez, M. J. et al. A Novel Ergot Alkaloid as a 5-HT1A Inhibitor Produced by Dicyma sp. J. Med. Chem. 46, 5117–5120 (2003). 172. Zhang, Y. et al. Novel antiinsectan oxalicine alkaloids from two undescribed fungicolous Penicillium spp. Org. Lett. 5, 773–776 (2003). 173. De Felício, R. et al. Antibacterial, antifungal and cytotoxic activities exhibited by endophytic fungi from the Brazilian marine red alga Bostrychia tenella (Ceramiales). Brazilian J. Pharmacogn. 25, 641–650 (2015). 174. Tsuda, M. et al. Citrinadin A, a Novel Pentacyclic Alkaloid from Marine-Derived Fungus FCUP 84 Exploring Polar Microbiomes as Source of Bioactive Molecules

Penicillium citrinum. Org. Lett. 6, 3087–3089 (2004). 175. Sakadi, M., Tsuda, M., Sekiguchi, M., Mikami, Y. & Kobayashi, J. Perinadine A, a novel tetracyclic alkaloid from marine-derived fungus Penicillium citrinum. Org. Lett. 7, 4261– 4264 (2005). 176. Samanthi, K. A. U., Wickramaarachchi, S., Wijeratne, E. M. K. & Paranagama, P. A. Two new antioxidant active polyketides from penicillium citrinum, an endolichenic fungus isolated from parmotrema species in Sri Lanka. J. Natl. Sci. Found. Sri Lanka 43, 119– 126 (2015). 177. Lai, D., Brötz-Oesterhelt, H., Müller, W. E. G., Wray, V. & Proksch, P. Bioactive polyketides and alkaloids from Penicillium citrinum, a fungal endophyte isolated from Ocimum tenuiflorum. Fitoterapia 91, 100–106 (2013). 178. Lu, Z. Y. et al. Citrinin dimers from the halotolerant fungus Penicillium citrinum B-57. J. Nat. Prod. 71, 543–546 (2008). 179. Ranji, P., Wijeyaratne, S., Jayawardana, K. & Gunaherath, G. Citriquinones A and B, new benzoquinones from Penicillium citrinum. Nat Prod Commun 8, 1431–1434 (2013). 180. Abdel-Mageed, W. M. et al. Dermacozines, a new phenazine family from deep-sea dermacocci isolated from a Mariana Trench sediment. Org. Biomol. Chem. 8, 2352 (2010). 181. Ramos, V. M. C. et al. Cyanobacterial diversity in microbial mats from the hypersaline lagoon system of Araruama, Brazil: An in-depth polyphasic study. Front. Microbiol. 8, 1– 16 (2017). 182. Saitou, N. & Nei, M. The Neighbor-joining Method : A New Method for Reconstructing Phylogenetic Trees. Mol. Biol. Evol. 4, 406–425 (1987).


VIII – Supplementary Information

Table S1 – Nucleotide sequences of KS domains collected for primer design. For each sequence, the KS class, the source organism, the accession number, the pathway product (when known), the selected region and the phylum are given.

| Name | Domain Type | Organism | Accession number | Pathway product | Region selected | Phyla |
|---|---|---|---|---|---|---|
| Microcystis aeruginosa NIES-843 | KS Type I-Enediyne | Microcystis aeruginosa NIES-843 | AP009552.1 | | AP009552.1 | Cyanobacteria |
| Herpetosiphon aurantiacus DSM 785 | KS Type I-Enediyne | Herpetosiphon aurantiacus DSM 785 | CP000875.1 | | 984212-985579 | Chloroflexi |
| calE8 | KS Type I-Enediyne | Micromonospora echinospora | AF497482.1 | Calicheamicin | 43127-44503 | Actinobacteria |
| Haliangium ochraceum DSM 14365 | KS Type I-Enediyne | Haliangium ochraceum DSM 14365 | CP001804.1 | | 909311-910675 | Deltaproteobacteria |
| Methylobacterium radiotolerans JCM 2831 | KS Type I-Enediyne | Methylobacterium radiotolerans JCM 2831 | CP001001.1 | | 1007519-1008784 | Alphaproteobacteria |
| Pandoraea oxalativorans strain DSM 23570 | KS Type I-Enediyne | Pandoraea oxalativorans strain DSM 23570 | CP011253.3 | | 4597936-4599306 | Betaproteobacteria |
| Methylococcus capsulatus str. Bath | KS Type I-Enediyne | Methylococcus capsulatus str. Bath | AE017282.2 | | 1311699-1313072 | Gammaproteobacteria |
| hglE | KS Type I-PUFA | Nostoc sp. 'Peltigera membranacea cyanobiont' glycolipid gene cluster | KC489223.1 | | 157-1527 | Cyanobacteria |
| Streptomyces pristinaespiralis strain HCCB 10218 | KS Type I-PUFA | Streptomyces pristinaespiralis strain HCCB 10218 | CP011340.1 | | 4068178-4069542 | Actinobacteria |
| Roseiflexus castenholzii DSM 13941 | KS Type I-PUFA | Roseiflexus castenholzii DSM 13941 | CP000804.1 | | 3728426-3729799 | Chloroflexi |
| pfaA Aureispira marina | KS Type I-PUFA | Aureispira marina strain: JCM 23201 | AB980240.1 | | 2511-3884 | Bacteroidetes |
| Elusimicrobium minutum Pei191 | KS Type I-PUFA | Elusimicrobium minutum Pei191 | CP001055.1 | | 1202229-1203605 | Elusimicrobia |
| Planctomyces sp. SH-PL62 | KS Type I-PUFA | Planctomyces sp. SH-PL62 | CP011273.1 | | 356668-358002 | PVC |
| Desulfatibacillum alkenivorans AK-01 | KS Type I-PUFA | Desulfatibacillum alkenivorans AK-01 | CP001322.1 | | 1417475-1418836 | Deltaproteobacteria |
| Xanthobacter autotrophicus Py2 | KS Type I-PUFA | Xanthobacter autotrophicus Py2 | CP000781.1 | | 405518-406789 | Alphaproteobacteria |
| Rubrivivax gelatinosus IL144 | KS Type I-PUFA | Rubrivivax gelatinosus IL144 | AP012320.1 | | 3308219-3309571 | Betaproteobacteria |
| pfaA Shewanella violacea DSS12 | KS Type I-PUFA | Shewanella violacea DSS12 | AP011177.1 | | 1411850-1413214 | Gammaproteobacteria |
| Gloeobacter violaceus PCC 7421 | KS Type I-Trans-AT | Gloeobacter violaceus PCC 7421 | BA000045.2 | | 2086856-2088127 | Cyanobacteria |
| | KS Type I-Trans-AT | Bacillus amyloliquefaciens strain FZB42 | AJ634060.2 | Bacillaene | 37992-39296 | Firmicutes |
| ituA | KS Type I-Trans-AT | Bacillus subtilis | AB050629.1 | Iturin | 7162-8427 | Firmicutes |
| no similarity found | KS Type I-Trans-AT | no similarity found | no similarity found | | no similarity found | Chloroflexi |
| lnmI | KS Type I-Trans-AT | Streptomyces atroolivaceus | AF484556.1 | Leinamycin | 75474-76712 | Actinobacteria |
| elaJ | KS Type I-Trans-AT | Chitinophaga sancti | HQ680975.1 | | 12199-13485 | Bacteroidetes |
| no similarity found | KS Type I-Trans-AT | no similarity found | no similarity found | no similarity found | no similarity found | Elusimicrobia |
| no trans-AT KS domain found | KS Type I-Trans-AT | no trans-AT KS domain found | no trans-AT KS domain found | no trans-AT KS domain found | no trans-AT KS domain found | PVC |
| no similarity found | KS Type I-Trans-AT | no similarity found | no similarity found | no similarity found | no similarity found | Omnitrophica |
| dszA | KS Type I-Trans-AT | Polyangium cellulosum strain So ce12 | DQ013294.1 | Disorazole | 15746-17014 | Deltaproteobacteria |
| Xanthobacter autotrophicus Py2 | KS Type I-Trans-AT | Xanthobacter autotrophicus Py2 | CP000781.1 | | 4217707-4218963 | Alphaproteobacteria |
| Burkholderia gladioli BSR3 | KS Type I-Trans-AT | Burkholderia gladioli BSR3 | CP002599.1 | | 2432482-2433726 | Betaproteobacteria |
| Lysobacter sp. ATCC 53042 | KS Type I-Trans-AT | Lysobacter sp. ATCC 53042 | JF412274.1 | lysobactin | 134515-135771 | Gammaproteobacteria |
| blmVIII | KS Type I-Hybrids | Streptomyces verticillus | AF210249.1 | Bleomycin | 34151-35431 | Actinobacteria |
| nosB | KS Type I-Hybrids | Nostoc sp. GSV224 | AF204805.2 | Nostopeptolide | 14548-15828 | Cyanobacteria |
| Paenibacillus mucilaginosus K02 | KS Type I-Hybrids | Paenibacillus mucilaginosus K02 | CP003422.2 | | 5234182-5235423 | Firmicutes |
| Herpetosiphon aurantiacus DSM 785 | KS Type I-Hybrids | Herpetosiphon aurantiacus DSM 785 | CP000875.1 | | 3095245-3096519 | Chloroflexi |
| Hymenobacter sp. PAMC 26554 | KS Type I-Hybrids | Hymenobacter sp. PAMC 26554 | CP014771.1 | | 3880459-3881736 | Bacteroidetes |
| no similarity found | KS Type I-Hybrids | no similarity found | no similarity found | no similarity found | no similarity found | Elusimicrobia |
| Opitutus terrae PB90-1 | KS Type I-Hybrids | Opitutus terrae PB90-1 | CP001032.1 | | 2482274-2483554 | PVC |
| no similarity found | KS Type I-Hybrids | no similarity found | no similarity found | no similarity found | no similarity found | Omnitrophica |
| mtaD | KS Type I-Hybrids | Stigmatella aurantiaca | AF188287.1 | Myxothiazol | 4853505-4854785 | Deltaproteobacteria |
| Paracoccus denitrificans PD1222 | KS Type I-Hybrids | Paracoccus denitrificans PD1222 | CP000489.1 | | 814611-815894 | Alphaproteobacteria |
| var4 | KS Type I-Hybrids | Variovorax paradoxus strain P4B | KT362218.1 | Variobactin | 6789-8090 | Betaproteobacteria |
| Alcanivorax pacificus W11-5 | KS Type I-Hybrids | Alcanivorax pacificus W11-5 | CP004387.1 | | 1162913-1164202 | Gammaproteobacteria |
| aviM | KS Type I-Iterative | Streptomyces viridochromogenes Tue57 | AF333038.2 | Avilamycin A | 17696-19000 | Actinobacteria |
| no iterative KS domain found | KS Type I-Iterative | no iterative KS domain found | no iterative KS domain found | no iterative KS domain found | no iterative KS domain found | Cyanobacteria |
| Paenibacillus mucilaginosus K02 | KS Type I-Iterative | Paenibacillus mucilaginosus K02 | CP003422.2 | | 4656080-4657369 | Firmicutes |
| no iterative KS domain found | KS Type I-Iterative | no iterative KS domain found | no iterative KS domain found | no iterative KS domain found | no iterative KS domain found | Chloroflexi |
| no iterative KS domain found | KS Type I-Iterative | no iterative KS domain found | no iterative KS domain found | no iterative KS domain found | no iterative KS domain found | Bacteroidetes |
| no similarity found | KS Type I-Iterative | no similarity found | no similarity found | no similarity found | no similarity found | Elusimicrobia |
| Opitutus terrae PB90-1 | KS Type I-Iterative | Opitutus terrae PB90-1 | CP001032.1 | | 2460043-2461272 | PVC - Verrucomicrobia |
| Singulisphaera acidiphila DSM 18658 | KS Type I-Iterative | Singulisphaera acidiphila DSM 18658 | CP003364.1 | | 2563678-2564958 | PVC - Planctomycetes |
| no similarity found | KS Type I-Iterative | no similarity found | no similarity found | no similarity found | no similarity found | Omnitrophica |
| Corallococcus coralloides DSM 2259 | KS Type I-Iterative | Corallococcus coralloides DSM 2259 | CP003389.1 | | 2629423-2630748 | Deltaproteobacteria |
| Burkholderia cenocepacia strain ST32 | KS Type I-Iterative | Burkholderia cenocepacia strain ST32 | CP011917.1 | | 3566809-3568086 | Betaproteobacteria |
| Tistrella mobilis KA081020-065 | KS Type I-Iterative | Tistrella mobilis KA081020-065 | CP003236.1 | | 538323-539606 | Alphaproteobacteria |
| Nitrosomonas europaea ATCC 19718 | KS Type I-Iterative | Nitrosomonas europaea ATCC 19718 | AL954747.1 | | 1515934-1517205 | Betaproteobacteria |
| HSAF | KS Type I-Iterative | Lysobacter enzymogenes strain C3 | EF028635.2 | HSAF | 15213-16493 | Gammaproteobacteria |
| vinP2 | KS Type I-Modular | Streptomyces halstedii | AB086653.1 | vicenistatin | 4517-5794 | Actinobacteria |
| JamG | KS Type I-Modular | Lyngbya majuscula | AY522504.1 | Jamaicamides | 18222-19448 | Cyanobacteria |
| Bacillus velezensis strain CC09 | KS Type I-Modular | Bacillus velezensis strain CC09 | CP015443.1 | | 1045643-1046884 | Firmicutes |
| Herpetosiphon aurantiacus DSM 785 | KS Type I-Modular | Herpetosiphon aurantiacus DSM 785 | CP000875.1 | | 4967704-4968978 | Chloroflexi |
| Spirosoma radiotolerans strain DG5A | KS Type I-Modular | Spirosoma radiotolerans strain DG5A | CP010429.1 | | 581904-583184 | Bacteroidetes |
| no similarity found | KS Type I-Modular | no similarity found | no similarity found | no similarity found | no similarity found | Elusimicrobia |
| Opitutus terrae PB90-1 | KS Type I-Modular | Opitutus terrae PB90-1 | CP001032.1 | | 2465352-2466632 | PVC - Verrucomicrobia |
| Rhodopirellula baltica SH 1 | KS Type I-Modular | Rhodopirellula baltica SH 1 | BX294154.1 | | 87460-88737 | PVC - Planctomycetes |
| no similarity found | KS Type I-Modular | no similarity found | no similarity found | no similarity found | no similarity found | Omnitrophica |
| epoA | KS Type I-Modular | Sorangium cellulosum strain So ce90 | AF210843.1 | Epothilone | 7640-8920 | Deltaproteobacteria |
| hliP | KS Type I-Modular | Haliangium ochraceum DSM 14365 strain SMP-2 | KU523553.1 | Haliangicin | 37-1347 | Deltaproteobacteria |
| Methylobacterium sp. 4-46 | KS Type I-Modular | Methylobacterium sp. 4-46 | CP000943.1 | | 199918-201195 | Alphaproteobacteria |
| Achromobacter xylosoxidans strain FDAARGOS_147 | KS Type I-Modular | Achromobacter xylosoxidans strain FDAARGOS_147 | CP014060.1 | | 6447172-6448449 | Betaproteobacteria |
| pltC | KS Type I-Modular | Pseudomonas fluorescens Pf-5 | AF081920.3 | Pyoluteorin | 12384-13667 | Gammaproteobacteria |
| tmnAI | KS Type I-KS1 | Streptomyces sp. NRRL 1126 | AB193609.1 | Tetronomycin | 20826-22082 | Actinobacteria |
| jamE | KS Type I-KS1 | Lyngbya majuscula | AY522504.1 | Jamaicamide | 12614-13879 | Cyanobacteria |
| no KS1 domain found | KS Type I-KS1 | no KS1 domain found | no KS1 domain found | no KS1 domain found | no KS1 domain found | Firmicutes |
| Herpetosiphon aurantiacus DSM 785 | KS Type I-KS1 | Herpetosiphon aurantiacus DSM 785 | CP000875.1 | | 3075895-3077166 | Chloroflexi |
| no KS1 domain found | KS Type I-KS1 | no KS1 domain found | no KS1 domain found | no KS1 domain found | no KS1 domain found | Bacteroidetes |
| no similarity found | KS Type I-KS1 | no similarity found | no similarity found | no similarity found | no similarity found | Elusimicrobia |
| Singulisphaera acidiphila DSM 18658 | KS Type I-KS1 | Singulisphaera acidiphila DSM 18658 | CP003364.1 | | 8923641-8924897 | PVC |
| no similarity found | KS Type I-KS1 | no similarity found | no similarity found | no similarity found | no similarity found | Omnitrophica |
| gulB | KS Type I-KS1 | Pyxidicoccus fallax strain DSM 28991 | KM361622.1 | Gulmirecin | 2167-3429 | Deltaproteobacteria |
| sorA | KS Type I-KS1 | Sorangium cellulosum | U24241.2 | soraphen | 14661-15929 | Deltaproteobacteria |
| stiA | KS Type I-KS1 | Stigmatella aurantiaca | AJ421825.1 | Stigmatellin | 14603-15868 | Deltaproteobacteria |
| no KS1 domain found | KS Type I-KS1 | no KS1 domain found | no KS1 domain found | no KS1 domain found | no KS1 domain found | Betaproteobacteria |
| no KS1 domain found | KS Type I-KS1 | no KS1 domain found | no KS1 domain found | no KS1 domain found | no KS1 domain found | Gammaproteobacteria |

Table S2 – Nucleotide sequences of AD domains collected for primer design. For each sequence, the source organism, the accession number, the pathway product (when known), the selected region and the phylum are given.

| Name | Domain | Organism | Accession number | Pathway product | Region selected | Phyla |
|---|---|---|---|---|---|---|
| comA | AD | Streptomyces lavendulae | AF386507.1 | complestatin | 8871-10046 | Actinobacteria |
| comC | AD | Streptomyces lavendulae | AF386507.1 | complestatin | 18154-19278 | Actinobacteria |
| nocA | AD | Nocardia uniformis subsp. tsuyamanensis | AY541063.1 | nocardicin A | 10551-11810 | Actinobacteria |
| nocB | AD | Nocardia uniformis subsp. tsuyamanensis | AY541063.1 | nocardicin A | 21674-22891 | Actinobacteria |
| acmA | AD | Streptomyces anulatus | HM038106.1 | actinomycin | 13560-14651 | Actinobacteria |
| acmB | AD | Streptomyces anulatus | HM038106.1 | actinomycin | 16633-17874 | Actinobacteria |
| Streptomyces coelicolor A3 (SCO6431) | AD | Streptomyces coelicolor A3 | AL939127.1 | | 229148-230332 | Actinobacteria |
| qui6 | AD | Streptomyces griseovariabilis subsp. bandungensis | JN852959.1 | quinomycin | 10001-11221 | Actinobacteria |
| vlm1 | AD | Streptomyces tsusimaensis strain ATCC 15141 | DQ174261.1 | valinomycin | 19697-21004 | Actinobacteria |
| Pyre | AD | Streptomyces pyridomyceticus strain NRRL B-2517 | HM436809.1 | pyridomycin | 10597-11712 | Actinobacteria |
| cmnA | AD | Saccharothrix mutabilis subsp. capreolus | EF472579.1 | capreomycin | 14609-15811 | Actinobacteria |
| mcnA | AD | Microcystis aeruginosa K-139 | AB481215.1 | micropeptin | 4316-5548 | Cyanobacteria |
| mcnB | AD | Microcystis aeruginosa K-139 | AB481215.1 | micropeptin | 7636-8943 | Cyanobacteria |
| adpA | AD | Anabaena sp. | AJ269505.1 | anabaenopeptilides | 11171-12379 | Cyanobacteria |
| adpB | AD | Anabaena sp. | AJ269505.2 | anabaenopeptilides | 17622-18848 | Cyanobacteria |
| aptA1 | AD | Anabaena sp. 90 | GU174493.1 | Anabaenopeptin | 913-2163 | Cyanobacteria |
| aptA2 | AD | Anabaena sp. 90 | GU174493.1 | Anabaenopeptin | 7623-8807 | Cyanobacteria |
| ndaA | AD | Nostoc sp. 73.1 | JF342711.1 | Nodularin | 2196-3449 | Cyanobacteria |
| ndaB | AD | Nostoc sp. 73.1 | JF342711.1 | Nodularin | 10808-12007 | Cyanobacteria |
| ociB | AD | Planktothrix rubescens NIVA-CYA 98 | AM990463.2 | cyanopeptolin | 51972-53201 | Cyanobacteria |
| nosA | AD | Nostoc sp. GSV224 | AF204805.2 | nostopeptolide | 2885-4135 | Cyanobacteria |
| ablD | AD | Anabaena sp. XPORK15F | KP761740.1 | anabaenolysin | 16089-17303 | Cyanobacteria |
| dhbF | AD | Bacillus subtilis subsp. subtilis strain 168G | CP016852.1 | bacillibactin | 3284372-3285571 | Firmicutes |
| bacA | AD | Bacillus licheniformis | AF007865.2 | bacitracin | 1640-2833 | Firmicutes |
| bacB | AD | Bacillus licheniformis | AF007865.2 | bacitracin | 18968-20140 | Firmicutes |
| bacC | AD | Bacillus licheniformis | AF007865.2 | bacitracin | 26713-27906 | Firmicutes |
| Fend | AD | Bacillus subtilis | AJ011849.1 | fengycin | 1520-2701 | Firmicutes |
| licA | AD | Bacillus licheniformis | U95370.1 | lichenysin | 1920-3137 | Firmicutes |
| licB | AD | Bacillus licheniformis | U95370.1 | lichenysin | 12703-13887 | Firmicutes |
| crs1 | AD | Bacillus cereus | AB248763.2 | cereulide | 792-2123 | Firmicutes |
| crs2 | AD | Bacillus cereus | AB248763.2 | cereulide | 11191-12516 | Firmicutes |
| mycB | AD | Bacillus subtilis | AF184956.1 | mycosubtilin | 16000-17190 | Firmicutes |
| licC | AD | Bacillus licheniformis | U95370.1 | lichenysin | 22080-22976 | Firmicutes |
| ituC | AD | Bacillus subtilis | AB050629.1 | iturin | 34189-35382 | Firmicutes |
| bamB | AD | Bacillus subtilis | AY137375.1 | bacillomycin D | 15586-16776 | Firmicutes |
| tycB | AD | Brevibacillus brevis | AF004835.1 | tyrocidine | 5752-6954 | Firmicutes |
| Herpetosiphon aurantiacus DSM 785 (Haur_1574) | AD | Herpetosiphon aurantiacus DSM 785 | CP000875.1 | | 1826448-1827653 | Chloroflexi |
| Herpetosiphon aurantiacus DSM 785 (Haur_1805) | AD | Herpetosiphon aurantiacus DSM 785 | CP000875.1 | | 2096153-2097331 | Chloroflexi |
| Herpetosiphon aurantiacus DSM 785 (Haur_2091) | AD | Herpetosiphon aurantiacus DSM 785 | CP000875.1 | | 2610282-2611490 | Chloroflexi |
| Herpetosiphon aurantiacus DSM 785 (Haur_3129) | AD | Herpetosiphon aurantiacus DSM 785 | CP000875.1 | | 3952066-3953283 | Chloroflexi |
| Hymenobacter sp. PAMC 26554 (A0257_16655) | AD | Hymenobacter sp. PAMC 26554 | CP014771.1 | | 3874195-3875382 | Bacteroidetes |
| Mucilaginibacter gotjawali (MgSA37_03154) | AD | Mucilaginibacter gotjawali | AP017313.1 | | 346611-3467298 | Bacteroidetes |
| Mucilaginibacter gotjawali (MgSA37_03143) | AD | Mucilaginibacter gotjawali | AP017313.1 | | 3452109-3453299 | Bacteroidetes |
| Spirosoma radiotolerans strain DG5A (SD10_02305) | AD | Spirosoma radiotolerans strain DG5A | CP010429.1 | | 595989-597197 | Bacteroidetes |
| Spirosoma radiotolerans strain DG5A (SD10_09935) | AD | Spirosoma radiotolerans strain DG5A | CP010429.1 | | 2463102-2464319 | Bacteroidetes |
| Filimonas lacunae strain NBRC 104114 (FLA_1304) | AD | Filimonas lacunae strain NBRC 104114 | AP017422.1 | | 1587072-1588256 | Bacteroidetes |
| Filimonas lacunae NBRC 104114 (FLA_1939) | AD | Filimonas lacunae strain NBRC 104114 | AP017422.1 | | | Bacteroidetes |
| Winogradskyella sp. PG-2 (WPG_0383) | AD | Winogradskyella sp. PG-2 | AP014583.1 | | 393781-394974 | Bacteroidetes |
| Chryseobacterium gallinarum strain DSM 27622 (OK18_15880) | AD | Chryseobacterium gallinarum strain DSM 27622 | CP009928.1 | | 3524512-3525699 | Bacteroidetes |
| Flammeovirgaceae bacterium 311 (D770_00005) | AD | Flammeovirgaceae bacterium 311 | CP004371.1 | | 6648482-6649669 | Bacteroidetes |
| Elusimicrobium minutum Pei191 (Emin_0995) | AD | Elusimicrobium minutum Pei191 | CP001055.1 | | 1083812-1084990 | Elusimicrobia |
| Elusimicrobium minutum Pei191 (Emin_1012) | AD | Elusimicrobium minutum Pei191 | CP001055.1 | | 1103893-1105023 | Elusimicrobia |
| no additional AD domains found | | | | | | Elusimicrobia |
| Verrucomicrobia bacterium L21-Fru-AB | AD | Verrucomicrobia bacterium L21-Fru-AB | CP010904.1 | | 1018914-1020098 | PVC |
| Singulisphaera acidiphila DSM 18658 (Sinac_2077) | AD | Singulisphaera acidiphila DSM 18658 | CP003364.1 | | 2555542-2556750 | PVC |
| Opitutus terrae PB90-1 (Oter_1968) | AD | Opitutus terrae PB90-1 | CP001032.1 | | 2492688-2493914 | PVC |
| no additional AD domains found | | | | | | PVC |
| no AD domain found | AD | no AD domain found | no AD domain found | | | Omnitrophica |
| hlaA | AD | Haliangium ochraceum DSM 14365 | KU516823.1 | | 1525-2583 | Deltaproteobacteria |
| cndF | AD | Chondromyces crocatus Cmc5 | CP012159.1 | | 721596-721718 | Deltaproteobacteria |
| chiD | AD | Sorangium cellulosum 'So ce 56' | AM746676.1 | | 580410-5805310 | Deltaproteobacteria |
| mscF | AD | Jahnella sp. MSr9139 | KF657739.1 | | 33063-34283 | Deltaproteobacteria |
| mscH | AD | Sorangium cellulosum strain So ce38 | KF657738.1 | | 38530-3974 | Deltaproteobacteria |
| antB | AD | Polyangium spumosum strain Pl sm9 | KU245058.1 | | 28337-29527 | Deltaproteobacteria |
| Myxococcus fulvus 124B02 (MFUL124B02_24325) | AD | Myxococcus fulvus 124B02 | CP006003.1 | | 6297328-6298533 | Deltaproteobacteria |
| ncyE | AD | Nannocystis sp. MB1016 | KT067736.1 | nannocystin | 32178-33356 | Deltaproteobacteria |
| tubC | AD | Angiococcus disciformis | AJ620477.1 | Tubulysin | 41827-43023 | Deltaproteobacteria |
| melC | AD | Melittangium lichenicola | AJ557546.1 | Melithiazol | 18679-19884 | Deltaproteobacteria |
| orbI | AD | Burkholderia cenocepacia strain 715j | DQ279460.1 | Ornibactin | 2608-4134 | Betaproteobacteria |
| Burkholderia cepacia GG4 (GI:402247746) | AD | Burkholderia cepacia GG4 | CP003774.1 | Cepaciachelin | 1920882-1922108 | Betaproteobacteria |
| depD | AD | Chromobacterium violaceum strain 968 | EF210776.1 | anticancer agent FK228 | 24462-25646 | Betaproteobacteria |
| Collimonas sp. MPS11E8 (CCT_ORF03016) | AD | Collimonas sp. MPS11E8 | FJ965830.1 | | 4182-5405 | Betaproteobacteria |
| Delftia sp. Cs1-4 (DelCs14_2100) | AD | Delftia sp. Cs1-4 | CP002735.1 | | 2384799-2385992 | Betaproteobacteria |
| Ralstonia solanacearum (RSp1422) | AD | Ralstonia solanacearum | AL646053.1 | | 1790157-1791365 | Betaproteobacteria |
| Variovorax paradoxus EPS (Varpa_4519) | AD | Variovorax paradoxus EPS | CP002417.1 | | 4906573-4907721 | Betaproteobacteria |
| arfA | AD | Pseudomonas sp. MIS38 | AB107223.1 | Arthrofactin | 3145-4323 | Gammaproteobacteria |
| cbsF | AD | Dickeya chrysanthemi | AF011334.2 | enterobactin esterase | 3366-4607 | Gammaproteobacteria |
| Azotobacter vinelandii CA6 (AVIN_RS11710) | AD | Azotobacter vinelandii CA6 | CP005095.1 | | 2569633-2570847 | Gammaproteobacteria |
| ofaA | AD | Pseudomonas protegens Pf-5 | CP000076.1 | | 2369295-2370473 | Gammaproteobacteria |
| Pantoea agglomerans (AAO39110.1) | AD | Pantoea agglomerans | AY192157.1 | Andrimid | 19370-20530 | Gammaproteobacteria |
| massB | AD | Pseudomonas fluorescens strain SS101 | EU199081.2 | massetolide A | 13411-14610 | Gammaproteobacteria |
| vibF | AD | Vibrio cholerae strain O395 | AF287255.1 | vibriobactin | 2764-3975 | Gammaproteobacteria |
| Erwinia amylovora ATCC BAA-2158 (EAIL5_3813) | AD | Erwinia amylovora ATCC BAA-2158 | FR719206.1 | | 2297-3514 | Gammaproteobacteria |
| Xanthomonas oryzae pv. oryzicola strain RS105 (ACU12_09555) | AD | Xanthomonas oryzae pv. oryzicola strain RS105 | CP011961.1 | | 2132942-2134111 | Gammaproteobacteria |
| Bradyrhizobium oligotrophicum S58 (S58_21570) | AD | Bradyrhizobium oligotrophicum S58 | AP012603.1 | | 2483561-2484742 | Alphaproteobacteria |
| Methylobacterium populi BJ001 (Mpop_5163) | AD | Methylobacterium populi BJ001 | CP001029.1 | | 5527925-5529103 | Alphaproteobacteria |
| Azospirillum thiophilum strain BV-S (AL072_22320) | AD | Azospirillum thiophilum strain BV-S | CP012403.1 | | 719724-720926 | Alphaproteobacteria |
| Methylocella silvestris BL2 (Msil_0855) | AD | Methylocella silvestris BL2 | CP001280.1 | | 942623-943822 | Alphaproteobacteria |
| Rhizobium leguminosarum Vaf10 (BA011_35915) | AD | Rhizobium leguminosarum strain Vaf10 | CP016292.1 | | 680-1657 | Alphaproteobacteria |
| Rhizobium leguminosarum strain Vaf10 (BA011_36690) | AD | Rhizobium leguminosarum strain Vaf10 | CP016292.1 | | 162109-163296 | Alphaproteobacteria |
| Rhizobium leguminosarum strain Vaf10 (BA011_37190) | AD | Rhizobium leguminosarum strain Vaf10 | CP016292.1 | | 274218-275417 | Alphaproteobacteria |
| Tistrella mobilis KA081020-065 (TMO_c0602) | AD | Tistrella mobilis KA081020-065 | CP003239.1 | | 676408-677565 | Alphaproteobacteria |
| Methylobacterium extorquens CM4 (Mchl_5090) | AD | Methylobacterium extorquens CM4 | CP001298.1 | | 5450727-5451902 | Alphaproteobacteria |
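The "Region selected" entries in Table S2 are inclusive start-end coordinates, so the length of each selected AD-domain region follows directly from them. The sketch below is an illustrative check only and is not part of the original workflow: the function names and the 500–2000 bp plausibility window are assumptions made for this example. It computes region lengths and flags entries whose coordinates look inconsistent, for instance an end coordinate smaller than the start.

```python
def region_length(region: str) -> int:
    """Length in bp of an inclusive 'start-end' region string."""
    # Normalise spacing and en dashes, which appear in some table entries.
    cleaned = region.replace(" ", "").replace("–", "-")
    start_text, end_text = cleaned.split("-")
    return int(end_text) - int(start_text) + 1


def flag_suspect_regions(regions, min_bp=500, max_bp=2000):
    """Return the regions whose length falls outside [min_bp, max_bp].

    The window is a hypothetical sanity range chosen for this example,
    not a value taken from the thesis.
    """
    return [r for r in regions if not (min_bp <= region_length(r) <= max_bp)]


if __name__ == "__main__":
    # A few entries copied from Table S2.
    examples = [
        "8871-10046",       # comA
        "1525 – 2583",      # hlaA
        "346611-3467298",   # Mucilaginibacter gotjawali (MgSA37_03154)
        "721596 -721718",   # cndF
        "38530-3974",       # mscH
    ]
    for region in examples:
        print(region, region_length(region))
    print("suspect:", flag_suspect_regions(examples))
```

Entries flagged this way would need to be checked against the corresponding GenBank record before the coordinates are reused.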


Table S3 – Primers designed for the KS and AD domains. For each primer, the corresponding region of the alignment and the annealing temperature (determined using OligoAnalyzer 3.1) are indicated.

| Primer | Sequence | Gene region | Annealing temp |
|---|---|---|---|
| KSF1 | GARGCNCAYGGNACVGG | 1086-1103 (KS) | 53 °C |
| KSF2 | GCNCAYGGNACVGGBAC | 1089-1106 (KS) | 54 °C |
| KSR1 | TTNGTNCCNCYVAHNCC | 1475-1491 (KS) | 48 °C |
| ADFw1 | GGNGARHTNTDYVTBGSHGG | 1170-1189 (AD) | 52 °C |
| ADRv1 | CNCCBRRYTCVAYSCGVWRRCC | 1429-1451 (AD) | 58 °C |
| ADRv2 | CCDSCVABRHANADYTCNCC | 1170-1189 (AD) | 51 °C |
| ADFw2 | CNGGHWCNACVGGNVVVCC | 458-477 (AD) | 57 °C |
| ADFw2.1 | TCNGGHWCNACVGGNVVVCC | 458-477 (AD) | 58 °C |
| ADFw3 | GCNGGNGSGCBTAYSTNCC | 163-182 (AD) | 58 °C |
| ADRv3 | TTBGGYBBBCCBGTNGARCC | 461-480 (AD) | 57 °C |
| KS_v2_Fw | TCAARDCBAABVTYGGNC | 1176-1194 (KS) | 47 °C |
| KS_v2_Fw2 | TBAARDCBAABVTYGGNC | 1176-1194 (KS) | 46 °C |
| KS_v2_Rv | TTVRTGCCVCYVAHDCC | 1169-1185 (KS) | 50 °C |
| NRPS_v2_Fw | GCNGGNGSNGYBTAYSTNCC | 163-182 (AD) | 47 °C |
| NRPS_v2_Rv | CCBGTNGWRCCNGANGT | 455-471 (AD) | 51 °C |
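The primers in Table S3 are written in the IUPAC degenerate nucleotide code, so each sequence stands for a pool of distinct oligonucleotides. As a minimal sketch (the function names are illustrative and are not taken from the thesis or from OligoAnalyzer), the degeneracy of a primer can be computed as the product of the number of bases each symbol allows, and its reverse complement can be derived with the IUPAC complement table:

```python
from math import prod

# Bases encoded by each IUPAC nucleotide symbol.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

# Complement of each IUPAC symbol (e.g. R = A/G pairs with Y = T/C).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


def reverse_complement(primer: str) -> str:
    """Reverse complement of a primer, keeping IUPAC degenerate symbols."""
    return primer.upper().translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Two primers copied from Table S3.
    for name, seq in [("KSF1", "GARGCNCAYGGNACVGG"),
                      ("KSF2", "GCNCAYGGNACVGGBAC")]:
        print(name, degeneracy(seq), reverse_complement(seq))
```

A higher degeneracy means that each individual sequence is present at a lower effective concentration in the reaction, which is one common reason to compare alternative primer designs on this measure.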


Table S 4 – Information on the bacterial strains used for primer testing, including antiSMASH genome analysis.

Organisms | Phylum (Genome accession number) | antiSMASH analysis: number of PKS Type I / NRPS / NRPS-PKS (hybrid) gene clusters | Tested for PCR optimization
S. natalensis ATCC 27448 | Actinobacteria (NZ_JRKI00000000.1) | 3 / 6 / - |
S. griseus subsp. griseus NBRC 13350 | Actinobacteria (AP009493.1) | 2 / 2 / 6 |
S. avermitilis MA-4680 (= NBRC 14893) | Actinobacteria (BA000030.4) | 4 / 4 / 2 |
Chromobacterium violaceum CECT 494 = ATCC 12472 | Betaproteobacteria (NC_005085.1) | - / 3 / - |
Cobetia marina CECT 4278 | Gammaproteobacteria (NZ_CP017114.1) | 1 / - |
Halomonas aquamarina CECT 5000 | Gammaproteobacteria (FODB00000000.1) | 1 / - / - |
Sagittula stellata CECT 7782 | Alphaproteobacteria (NZ_AAYA00000000.1) | 1 / - |
Phaeobacter gallaeciensis CECT 7277 T | Alphaproteobacteria (NC_023137.1) | 1 / - |
Roseimaritima ulvae UC8 | Planctomycetes (NZ_LWSJ00000000.1) | 3 / - / 1 |
Planctomycetes sp. strain FC18 | Planctomycetes (NZ_LWSI00000000.1) | - / 2 / - |

The strains LEGE 06098 (Cyanobium sp.), LEGE 07179 (Hyella sp.), LEGE 06152 (Nodosilinea nodulosa) and LEGE 06106 (Nostoc sp.) were selected from a previous work [22] as harbouring PKS and NRPS genes. Brito et al. 2015 screened the in-house LEGE cyanobacterial culture collection for the presence of these genes using the DKF/DKR [3] and MTF2/MTR [25] primers. LEGE 91339 (Microcystis aeruginosa) and LEGE 06071 (Nodularia sp.) are routinely used as positive controls for this type of amplification in our group, and their PCR products have already been sequenced and their identity confirmed.

Panel A: primer pair KSF1/KS_v2_Rv. Panel B: primer pair degK2F/degK2R. Lanes in each panel: M, 1, 2, 3, 4, 5, C-.

Figure S 1 - Electrophoresis gel of PCR products of KS domain amplification using the optimized conditions: (A) using primer pair KSF1/KS_v2_Rv and (B) using primer pair degK2F/degK2R. Legend: M – 100 bp and 1 kb ladder (Grisp); 1 – Cobetia marina gDNA; 2 – Halomonas aquamarina gDNA; 3 – Chromobacterium violaceum gDNA; 4 – Phaeobacter gallaeciensis gDNA; 5 – gDNA from Streptomyces avermitilis MA-4680 (positive control); 6 – negative control.

Table S 5 - Detailed alpha-diversity measures obtained for each sample in the study, including the number of sequences per sample and the average phylogenetic diversity (PD_whole_tree), chao1 and observed OTUs metrics.

SampleID | Seqs/Sample | PD_whole_tree Ave. | PD_whole_tree Err. | chao1 Ave. | chao1 Err. | observed_otus Ave. | observed_otus Err.
END | 10.0 | 1.778 | nan | 10.050 | nan | 5.200 | nan
END | 300.0 | 8.775 | nan | 111.072 | nan | 62.000 | nan
END | 590.0 | 11.152 | nan | 148.321 | nan | 90.300 | nan
END | 880.0 | 12.365 | nan | 156.145 | nan | 107.500 | nan
END | 1170.0 | 13.604 | nan | 180.769 | nan | 122.600 | nan
END | 1460.0 | 14.431 | nan | 180.894 | nan | 133.300 | nan
END | 1750.0 | 15.113 | nan | 188.864 | nan | 143.000 | nan
END | 2040.0 | 15.401 | nan | 189.875 | nan | 149.400 | nan
END | 2330.0 | 15.782 | nan | 195.568 | nan | 156.800 | nan
END | 2620.0 | 16.205 | nan | 197.232 | nan | 161.800 | nan
END | 2910.0 | 16.472 | nan | 197.533 | nan | 166.000 | nan
T1 | 10.0 | 3.002 | nan | 46.000 | nan | 9.600 | nan
T1 | 300.0 | 21.971 | nan | 530.295 | nan | 183.200 | nan
T1 | 590.0 | 30.527 | nan | 711.144 | nan | 298.700 | nan
T1 | 880.0 | 36.489 | nan | 867.020 | nan | 396.700 | nan
T1 | 1170.0 | 40.668 | nan | 920.508 | nan | 470.400 | nan
T1 | 1460.0 | 43.781 | nan | 955.694 | nan | 526.400 | nan
T1 | 1750.0 | 47.321 | nan | 999.787 | nan | 591.300 | nan
T1 | 2040.0 | 50.643 | nan | 1080.504 | nan | 649.500 | nan
T1 | 2330.0 | 52.722 | nan | 1112.668 | nan | 686.400 | nan
T1 | 2620.0 | 54.818 | nan | 1137.734 | nan | 730.700 | nan
T1 | 2910.0 | 56.947 | nan | 1168.852 | nan | 765.900 | nan
T2 | 10.0 | 2.997 | nan | 42.200 | nan | 9.600 | nan
T2 | 300.0 | 22.157 | nan | 554.998 | nan | 185.100 | nan
T2 | 590.0 | 29.238 | nan | 683.425 | nan | 295.700 | nan
T2 | 880.0 | 34.967 | nan | 747.700 | nan | 380.100 | nan
T2 | 1170.0 | 39.569 | nan | 883.258 | nan | 455.200 | nan
T2 | 1460.0 | 42.517 | nan | 950.843 | nan | 514.500 | nan
T2 | 1750.0 | 45.383 | nan | 967.974 | nan | 563.700 | nan
T2 | 2040.0 | 47.782 | nan | 1044.965 | nan | 612.600 | nan
T2 | 2330.0 | 50.193 | nan | 1051.678 | nan | 649.300 | nan
T2 | 2620.0 | 52.498 | nan | 1127.759 | nan | 693.600 | nan
T2 | 2910.0 | 54.580 | nan | 1152.546 | nan | 736.500 | nan
T3 | 10.0 | 2.751 | nan | 30.200 | nan | 8.900 | nan
T3 | 300.0 | 14.025 | nan | 207.037 | nan | 100.000 | nan
T3 | 590.0 | 17.560 | nan | 252.208 | nan | 141.500 | nan
T3 | 880.0 | 20.742 | nan | 314.902 | nan | 176.300 | nan
T3 | 1170.0 | 22.311 | nan | 362.426 | nan | 207.100 | nan
T3 | 1460.0 | 24.407 | nan | 399.013 | nan | 231.700 | nan
T3 | 1750.0 | 25.318 | nan | 434.221 | nan | 252.700 | nan
T3 | 2040.0 | 27.131 | nan | 459.640 | nan | 271.200 | nan
T3 | 2330.0 | 27.473 | nan | 443.496 | nan | 285.400 | nan
T3 | 2620.0 | 28.474 | nan | 476.726 | nan | 303.900 | nan
T3 | 2910.0 | 29.248 | nan | 488.927 | nan | 314.300 | nan
T4 | 10.0 | 2.451 | nan | 39.000 | nan | 9.500 | nan
T4 | 300.0 | 15.911 | nan | 401.059 | nan | 148.700 | nan
T4 | 590.0 | 21.690 | nan | 507.766 | nan | 235.400 | nan
T4 | 880.0 | 25.897 | nan | 642.869 | nan | 302.300 | nan
T4 | 1170.0 | 28.742 | nan | 790.901 | nan | 351.300 | nan
T4 | 1460.0 | 31.379 | nan | 762.586 | nan | 399.200 | nan
T4 | 1750.0 | 33.909 | nan | 861.324 | nan | 446.700 | nan
T4 | 2040.0 | 35.862 | nan | 851.893 | nan | 471.900 | nan
T4 | 2330.0 | 37.160 | nan | 922.084 | nan | 510.000 | nan
T4 | 2620.0 | 39.173 | nan | 940.563 | nan | 544.300 | nan
T4 | 2910.0 | 40.334 | nan | 940.289 | nan | 571.100 | nan
T5 | 10.0 | 1.825 | nan | 23.200 | nan | 7.900 | nan
T5 | 300.0 | 11.804 | nan | 230.962 | nan | 100.400 | nan
T5 | 590.0 | 15.111 | nan | 316.505 | nan | 153.200 | nan
T5 | 880.0 | 18.495 | nan | 387.380 | nan | 196.400 | nan
T5 | 1170.0 | 19.819 | nan | 433.372 | nan | 227.700 | nan
T5 | 1460.0 | 22.005 | nan | 503.986 | nan | 258.700 | nan
T5 | 1750.0 | 23.796 | nan | 509.278 | nan | 284.400 | nan
T5 | 2040.0 | 24.721 | nan | 569.334 | nan | 307.400 | nan
T5 | 2330.0 | 25.895 | nan | 583.448 | nan | 329.300 | nan
T5 | 2620.0 | 27.344 | nan | 607.428 | nan | 352.900 | nan
T5 | 2910.0 | 28.111 | nan | 611.936 | nan | 364.900 | nan
T6 | 10.0 | 2.763 | nan | 52.400 | nan | 9.800 | nan
T6 | 300.0 | 19.841 | nan | 519.250 | nan | 178.800 | nan
T6 | 590.0 | 28.004 | nan | 782.436 | nan | 294.100 | nan
T6 | 880.0 | 32.555 | nan | 904.556 | nan | 375.200 | nan
T6 | 1170.0 | 37.867 | nan | 1036.788 | nan | 455.500 | nan
T6 | 1460.0 | 41.976 | nan | 1130.791 | nan | 524.300 | nan
T6 | 1750.0 | 44.989 | nan | 1172.492 | nan | 578.500 | nan
T6 | 2040.0 | 49.149 | nan | 1247.857 | nan | 635.700 | nan
T6 | 2330.0 | 50.968 | nan | 1308.019 | nan | 688.600 | nan
T6 | 2620.0 | 53.095 | nan | 1304.180 | nan | 726.300 | nan
T6 | 2910.0 | 57.124 | nan | 1427.286 | nan | 784.800 | nan
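For readers unfamiliar with the chao1 and observed_otus columns of Table S 5, the sketch below shows how these two richness metrics can be computed from a vector of per-OTU sequence counts. It uses the bias-corrected form of the Chao1 estimator, S_obs + F1(F1 - 1) / (2(F2 + 1)), where F1 and F2 are the numbers of singleton and doubleton OTUs. Whether the values in the table were obtained with exactly this variant is an assumption, and the example counts are invented for illustration only.

```python
def observed_otus(counts):
    """Number of OTUs with at least one sequence in the sample."""
    return sum(1 for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1 richness estimate from per-OTU counts."""
    s_obs = observed_otus(counts)
    singletons = sum(1 for c in counts if c == 1)  # F1
    doubletons = sum(1 for c in counts if c == 2)  # F2
    return s_obs + singletons * (singletons - 1) / (2 * (doubletons + 1))


if __name__ == "__main__":
    # Hypothetical OTU counts for one rarefied sample (not thesis data).
    example = [10, 5, 2, 2, 1, 1, 1, 0]
    print(observed_otus(example))  # 7
    print(chao1(example))          # 8.0
```

Because Chao1 adds an estimate of unseen OTUs based on rare ones, it is always at least as large as the observed OTU count, which matches the pattern in every row of the table.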


Table S 6 - Summary of taxonomic frequency distributions at genus level for Cyanobacteria. Genera with a 0% distribution were removed from the table.

Taxonomy | Total | END | T1 | T2 | T3 | T4 | T5 | T6
k__Bacteria;p__Cyanobacteria;c__;o__;f__;g__ | 2.4% | 0.0% | 16.4% | 0.1% | 0.0% | 0.1% | 0.0% | 0.0%
k__Bacteria;p__Cyanobacteria;c__Chloroplast;o__Chlorophyta;f__Chlamydomonadaceae;Other | 0.0% | 0.0% | 0.1% | 0.0% | 0.0% | 0.0% | 0.0% | 0.0%
k__Bacteria;p__Cyanobacteria;c__Chloroplast;o__Stramenopiles;f__;g__ | 0.5% | 0.0% | 0.2% | 1.6% | 0.4% | 0.7% | 0.1% | 0.2%
k__Bacteria;p__Cyanobacteria;c__ML635J-21;o__;f__;g__ | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | 0.1% | 0.0%
k__Bacteria;p__Cyanobacteria;c__Oscillatoriophycideae;o__Chroococcales;f__;g__ | 1.2% | 0.0% | 0.0% | 0.4% | 7.8% | 0.0% | 0.0% | 0.0%
k__Bacteria;p__Cyanobacteria;c__Oscillatoriophycideae;o__Oscillatoriales;f__Phormidiaceae;g__Phormidium | 1.9% | 0.0% | 1.3% | 11.0% | 1.0% | 0.0% | 0.0% | 0.0%
k__Bacteria;p__Cyanobacteria;c__Synechococcophycideae;o__Pseudanabaenales;f__Pseudanabaenaceae;g__ | 1.7% | 0.4% | 0.0% | 0.6% | 10.8% | 0.0% | 0.0% | 0.0%
k__Bacteria;p__Cyanobacteria;c__Synechococcophycideae;o__Pseudanabaenales;f__Pseudanabaenaceae;g__Leptolyngbya | 0.4% | 0.0% | 2.5% | 0.1% | 0.0% | 0.0% | 0.0% | 0.0%
k__Bacteria;p__Cyanobacteria;c__Synechococcophycideae;o__Pseudanabaenales;f__Pseudanabaenaceae;g__Pseudanabaena | 0.5% | 0.0% | 2.5% | 1.2% | 0.0% | 0.0% | 0.0% | 0.0%
k__Bacteria;p__Cyanobacteria;c__Synechococcophycideae;o__Synechococcales;f__Acaryochloridaceae;g__Acaryochloris | 6.3% | 44.2% | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | 0.0%
k__Bacteria;p__Cyanobacteria;c__Synechococcophycideae;o__Synechococcales;f__Chamaesiphonaceae;g__ | 0.0% | 0.0% | 0.1% | 0.0% | 0.0% | 0.0% | 0.0% | 0.0%
k__Bacteria;p__Cyanobacteria;c__Synechococcophycideae;o__Synechococcales;f__Synechococcaceae;g__Synechococcus | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | 0.1%


Table S 7 - Summary of taxonomic frequency distributions at Phylum level. Genera with a 0% distribution were removed from the table.

Taxonomy | Total | END | T1 | T2 | T3 | T4 | T5 | T6
k__Bacteria;p__Actinobacteria;c__Acidimicrobiia;o__Acidimicrobiales;f__;g__ | 4.5% | 1.1% | 1.3% | 4.3% | 1.9% | 11.2% | 3.7% | 7.7%
k__Bacteria;p__Actinobacteria;c__Acidimicrobiia;o__Acidimicrobiales;f__C111;g__ | 2.1% | 2.2% | 3.1% | 2.9% | 2.9% | 1.3% | 0.2% | 1.8%
k__Bacteria;p__Actinobacteria;c__Acidimicrobiia;o__Acidimicrobiales;f__EB1017;g__ | 0.2% | 0.0% | 1.0% | 0.0% | 0.0% | 0.0% | 0.0% | 0.1%
k__Bacteria;p__Actinobacteria;c__Acidimicrobiia;o__Acidimicrobiales;f__Iamiaceae;g__Iamia | 0.3% | 0.0% | 0.6% | 0.8% | 0.0% | 0.4% | 0.0% | 0.1%
k__Bacteria;p__Actinobacteria;c__Acidimicrobiia;o__Acidimicrobiales;f__JdFBGBact;g__ | 0.1% | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | 0.4%
k__Bacteria;p__Actinobacteria;c__Acidimicrobiia;o__Acidimicrobiales;f__Microthrixaceae;g__ | 0.0% | 0.0% | 0.1% | 0.0% | 0.0% | 0.0% | 0.0% | 0.0%
k__Bacteria;p__Actinobacteria;c__Acidimicrobiia;o__Acidimicrobiales;f__koll13;g__ | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | 0.1%
k__Bacteria;p__Actinobacteria;c__Acidimicrobiia;o__Acidimicrobiales;f__ntu14;g__ | 0.0% | 0.1% | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | 0.1%
k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;Other;Other | 0.2% | 0.0% | 0.4% | 0.0% | 0.0% | 0.0% | 0.4% | 0.2%
k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__;g__ | 1.4% | 1.0% | 1.5% | 1.7% | 0.6% | 1.9% | 1.7% | 1.5%
k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Frankiaceae;g__ | 0.1% | 0.3% | 0.0% | 0.0% | 0.0% | 0.0% | 0.1% | 0.0%
k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Intrasporangiaceae;Other | 0.0% | 0.0% | 0.0% | 0.0% | 0.1% | 0.0% | 0.0% | 0.0%
k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Intrasporangiaceae;g__ | 0.5% | 2.1% | 0.0% | 0.1% | 1.3% | 0.0% | 0.0% | 0.0%
k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Kineosporiaceae;g__ | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | 0.2%
k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Microbacteriaceae;g__ | 0.0% | 0.2% | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | 0.0%
k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Micrococcaceae;g__ | 0.1% | 0.4% | 0.1% | 0.1% | 0.0% | 0.0% | 0.0% | 0.0%
k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Nocardioidaceae;Other | 0.1% | 0.0% | 0.4% | 0.3% | 0.0% | 0.1% | 0.1% | 0.1%
k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Nocardioidaceae;g__ | 1.5% | 0.0% | 0.7% | 1.3% | 0.0% | 2.5% | 3.3% | 2.4%
k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Nocardioidaceae;g__Aeromicrobium | 0.1% | 0.0% | 0.1% | 0.0% | 0.0% | 0.5% | 0.0% | 0.3%
k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Nocardioidaceae;g__Nocardioides | 0.2% | 0.0% | 0.2% | 0.4% | 0.1% | 0.4% | 0.3% | 0.2%
k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Pseudonocardiaceae;g__ | 0.2% | 0.0% | 0.1% | 0.6% | 0.1% | 0.1% | 0.3% | 0.2%
k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Sporichthyaceae;g__ | 7.8% | 0.4% | 1.5% | 3.6% | 5.2% | 3.5% | 35.1% | 5.5%
k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Streptomycetaceae;g__ | 0.1% | 0.7% | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | 0.0%
k__Bacteria;p__Actinobacteria;c__MB-A2-108;o__0319-7L14;f__;g__ | 0.1% | 0.0% | 0.1% | 0.4% | 0.0% | 0.3% | 0.0% | 0.2%
k__Bacteria;p__Actinobacteria;c__Nitriliruptoria;o__Euzebyales;f__Euzebyaceae;g__Euzebya | 3.0% | 15.4% | 0.0% | 0.2% | 3.9% | 0.5% | 0.6% | 0.3%
k__Bacteria;p__Actinobacteria;c__Nitriliruptoria;o__Nitriliruptorales;f__Nitriliruptoraceae;g__ | 0.0% | 0.0% | 0.0% | 0.0% | 0.3% | 0.0% | 0.0% | 0.0%
k__Bacteria;p__Actinobacteria;c__Rubrobacteria;o__Rubrobacterales;f__Rubrobacteraceae;g__Rubrobacter | 1.7% | 5.5% | 0.1% | 0.2% | 0.1% | 2.0% | 1.7% | 1.9%
k__Bacteria;p__Actinobacteria;c__Thermoleophilia;o__Gaiellales;f__;g__ | 0.0% | 0.0% | 0.1% | 0.0% | 0.0% | 0.0% | 0.0% | 0.1%
k__Bacteria;p__Actinobacteria;c__Thermoleophilia;o__Gaiellales;f__AK1AB1_02E;g__ | 0.1% | 0.0% | 0.0% | 0.1% | 0.1% | 0.3% | 0.3% | 0.3%
k__Bacteria;p__Actinobacteria;c__Thermoleophilia;o__Gaiellales;f__Gaiellaceae;g__ | 1.2% | 0.0% | 1.7% | 2.9% | 0.0% | 2.2% | 0.2% | 1.4%
k__Bacteria;p__Actinobacteria;c__Thermoleophilia;o__Solirubrobacterales;Other;Other | 0.0% | 0.0% | 0.0% | 0.1% | 0.0% | 0.1% | 0.0% | 0.0%
k__Bacteria;p__Actinobacteria;c__Thermoleophilia;o__Solirubrobacterales;f__;g__ | 3.3% | 0.5% | 1.5% | 3.0% | 0.6% | 11.5% | 0.7% | 5.6%
k__Bacteria;p__Actinobacteria;c__Thermoleophilia;o__Solirubrobacterales;f__Conexibacteraceae;g__ | 0.1% | 0.2% | 0.1% | 0.0% | 0.0% | 0.3% | 0.1% | 0.2%
k__Bacteria;p__Actinobacteria;c__Thermoleophilia;o__Solirubrobacterales;f__Patulibacteraceae;g__ | 2.7% | 0.0% | 0.5% | 0.8% | 0.0% | 6.5% | 5.2% | 6.2%
k__Bacteria;p__Actinobacteria;c__Thermoleophilia;o__Solirubrobacterales;f__Solirubrobacteraceae;g__ | 0.5% | 0.0% | 0.7% | 0.3% | 0.1% | 1.0% | 0.9% | 0.8%