BACTERIAL NATURAL PRODUCT GENE BIOMINING IN POLAR DESERT SOILS

A DISSERTATION SUBMITTED BY

NICOLE BENAUD

IN FULFILLMENT OF THE REQUIREMENTS FOR

THE DEGREE OF

DOCTOR OF PHILOSOPHY

SCHOOL OF BIOTECHNOLOGY AND BIOMOLECULAR SCIENCES

UNIVERSITY OF NEW SOUTH WALES, SYDNEY AUSTRALIA

SUPERVISOR: ASSOCIATE PROFESSOR BELINDA C. FERRARI

CO-SUPERVISOR: DR JOHN A. KALAITZIS

June, 2019

THESIS ABSTRACT

New antimicrobial agents are urgently required to address a global resistance crisis. Natural products, biosynthesised through secondary metabolite pathways, remain at the forefront of drug discovery. Extreme environments are attractive targets for microbial biomining, due to their potential as reservoirs for novel metabolites. In polar regions, environmental conditions are some of Earth's most severe, and microbes dominate the biosphere. Moreover, arid polar soils comprise high relative abundances of Actinobacteria and Proteobacteria, prolific producers of natural products. This research had three main objectives: to identify polar soil bacterial communities with novel biosynthetic potential; to establish a culture collection of Antarctic isolates with demonstrated bioactive capabilities; and to perform whole genome sequencing (WGS) on biotechnologically promising isolates for biosynthetic gene cluster (BGC) mining. Third-generation long-read PacBio sequencing was employed to survey > 200 Antarctic and high Arctic soils for non-ribosomal peptide synthetase (NRPS) and polyketide synthase (PKS) domain amplicons. Significant negative relationships were observed between natural product genes and the soil fertility factors carbon, nitrogen and moisture. Sequences primarily aligned to domains encoding antifungal, antitumour and antimicrobial/surfactant compounds, but with low sequence similarity (< 70%) to known genes. Using novel culturing approaches, 19 bacterial genera across 4 phyla were isolated from Antarctic soils, including 32 Actinomycetales. Extended oligotrophic incubation times were related to the recovery of novel and rare strains. In in situ antimicrobial assays, was the only genus to produce measurable activity. WGS was performed for 17 Antarctic isolates using PacBio technology. Genomes predominantly returned high-quality assemblies, and BGC analysis revealed an abundance of terpene, NRPS, PKS, bacteriocin and siderophore clusters, with minimal gene similarity (< 70%) to known BGCs. In accordance with amplicon sequencing results, many NRPS and PKS domains aligned most closely to antifungal, antitumour and antimicrobial/surfactant-encoding genes. These findings indicate that Antarctic desert soils are excellent candidates for novel natural product bioprospecting and give further insight into the functional and ecological relevance of natural products in terms of competition between microbiota for scarce resources.


PUBLICATIONS

Peer reviewed journal article

Benaud, N., Zhang, E., van Dorst, J., Brown, M.V., Kalaitzis, J.A., Neilan, B.A., Ferrari, B.C. (2019). Harnessing Long-Read Amplicon Sequencing to Uncover NRPS and Type I PKS Gene Sequence Diversity in Polar Desert Soils. FEMS Microbiology Ecology, doi.org/10.1093/femsec/fiz031


TABLE OF CONTENTS

ABSTRACT
PUBLICATIONS
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
LIST OF FIGURES
LIST OF TABLES
ABBREVIATIONS

CHAPTER 1
1 INTRODUCTION
  1.1 Antibiotic resistance drives the need for novel bioactive compounds
  1.2 Microbial natural products
  1.3 Natural product biosynthesis
    1.3.1 Polyketide synthases (PKS)
      1.3.1.1 Type I PKS
      1.3.1.2 Type II PKS
      1.3.1.3 Type III PKS
    1.3.2 Non-ribosomal peptide synthetases (NRPS)
  1.4 Dominant natural product-producing bacterial phyla
    1.4.1 The Actinobacteria
    1.4.2 The Proteobacteria
    1.4.3 The Cyanobacteria
    1.4.4 The Firmicutes
  1.5 Microbial natural product diversity
  1.6 Cold-adapted as a source of novel natural products
  1.7 Polar terrestrial environments and their microbial diversity
  1.8 Molecular technologies for natural products discovery
  1.9 Thesis scope and aims

CHAPTER 2
2 HARNESSING LONG-READ AMPLICON SEQUENCING TO UNCOVER NRPS AND TYPE I PKS GENE SEQUENCE DIVERSITY IN POLAR DESERT SOILS
  2.1 INTRODUCTION
    2.1.1 The polar deserts of East Antarctica and the High Arctic
    2.1.2 Surveying polar desert soils for natural product genes
  2.2 MATERIALS AND METHODS
    2.2.1 Polar locations and soil collection
    2.2.2 DNA extraction and 16S rDNA gene sequencing
    2.2.3 Soil physical and chemical properties
    2.2.4 PKS PCR amplification, gel extraction and barcoding
    2.2.5 NRPS PCR amplification and barcoding
    2.2.6 Natural product amplicon library preparation for SMRT sequencing
    2.2.7 Processing PacBio SMRT sequencing data
    2.2.8 Taxonomic classification of sequences using the BLAST database
    2.2.9 Multivariate data analysis
    2.2.10 Statistical analysis
    2.2.11 Construction of phylogenetic trees
  2.3 RESULTS
    2.3.1 PKS and NRPS gene sequences compared across polar soils
    2.3.2 PKS and NRPS biosynthetic diversity in polar soils
    2.3.3 Classification and distribution of natural product gene cluster families
    2.3.4 Phylogenetic analysis of NP domain sequences
    2.3.5 Bacterial and Actinobacterial diversity of polar soils
    2.3.6 Relationships between polar natural product genes, microbiomes and soil fertility parameters
    2.3.7 NP domain sequence novelty
  2.4 DISCUSSION

CHAPTER 3
3 CULTURING COLD ADAPTED BACTERIA FROM MAJOR NATURAL PRODUCT PRODUCING PHYLA USING NOVEL APPROACHES
  3.1 INTRODUCTION
  3.2 MATERIALS AND METHODS
    3.2.1 Site description and soil characteristics
      3.2.1.1 Herring Island
      3.2.1.2 Mitchell Peninsula
      3.2.1.3 Rookery Lake
      3.2.1.4 Wilkes Tip
    3.2.2 Direct soil culturing methods
      3.2.2.1 Herring Island and Mitchell Peninsula DSC
      3.2.2.2 Rookery Lake and Wilkes Tip DSC
      3.2.2.3 Isolation and purification of bacteria from DSC
    3.2.3 SSMS culturing at cold temperatures
      3.2.3.1 Assessing microcolony growth and bacterial viability on the SSMS
      3.2.3.2 Secondary cultivation of SSMS microcolonies using artificial media
      3.2.3.3 Isolation and purification of bacteria from SSMS cultures
    3.2.4 Gram and lactophenol cotton blue stain differentiation
    3.2.5 Isolate DNA extraction and purification
    3.2.6 PCR amplification and Sanger sequencing of isolate 16S rDNA genes
    3.2.7 Cryopreservation of strains
    3.2.8 Type I PKS and NRPS domain screening by PCR
    3.2.9 In situ antimicrobial testing by cross-streak method
    3.2.10 Type I PKS and NRPS domain screening and antimicrobial assays for strains isolated in previous studies
    3.2.11 Bacterial 16S rDNA gene analysis for pristine soils
    3.2.12 Venn diagram visualisation of species shared between sites
    3.2.13 Biotechnological and biosynthetic potential of isolates
  3.3 RESULTS
    3.3.1 Direct soil culturing
    3.3.2 Cold-temperature SSMS cultures
    3.3.3 Summary of bacterial isolates cultured by DSC and SSMS
      3.3.3.1 Total bacteria cultured by all methods across four sites
      3.3.3.2 Bacterial colony pigmentation
    3.3.4 Natural product domain amplification and in situ antimicrobial activity for selected isolates
      3.3.4.1 Strains isolated in this study
      3.3.4.2 Strains isolated from previous studies
    3.3.5 Selection of isolates for whole genome sequencing
  3.4 DISCUSSION

CHAPTER 4
4 ANTARCTIC BACTERIAL GENOMES HARBOUR A WEALTH OF UNCHARACTERISED BIOSYNTHETIC GENE CLUSTERS
  4.1 INTRODUCTION
  4.2 MATERIALS AND METHODS
    4.2.1 High molecular weight genomic DNA extractions
      4.2.1.1 Spore harvesting for Streptomyces and Kribbella isolates
      4.2.1.2 Modified Kirby method for Streptomyces and Kribbella genomic DNA extraction
      4.2.1.3 Phenol-chloroform genomic DNA extraction for other genera
      4.2.1.4 Quantification and quality assessment of genomic DNA
    4.2.2 Multi-genome DNA library preparation and sequencing
    4.2.3 De novo genome assembly from multi-genome libraries
      4.2.3.1 Genome annotation, functional prediction and assessment of genome quality
      4.2.3.2 Phylogenetic analysis of genome-retrieved 16S rDNA genes
    4.2.4 Secondary metabolite gene cluster analysis
      4.2.4.1 antiSMASH analysis for all Antarctic genomes
      4.2.4.2 BLASTp and NaPDoS analysis of detected BGC domain sequences
  4.3 RESULTS
    4.3.1 Sequencing output and assembly of multi-genome libraries
    4.3.2 Individual genome assemblies, annotation and quality assessment
    4.3.3 Annotation and functional distribution of genes
    4.3.4 Phylogenetic analysis based on 16S rDNA genes
      4.3.4.1 Actinobacteria: Streptomyces group
      4.3.4.2 Actinobacteria: non-Streptomyces group
      4.3.4.3 Alphaproteobacteria group
      4.3.4.4 Bacteroidetes: Hymenobacter
    4.3.5 Biosynthetic gene clusters detected in Antarctic genomes
    4.3.6 Biosynthetic gene cluster verification for Streptomyces, Kribbella and Azospirillum isolates
      4.3.6.1 Streptomyces INR7 BGCs
      4.3.6.2 Streptomyces NBH77 BGCs
      4.3.6.3 Streptomyces NBSH44 BGCs
      4.3.6.4 Kribbella SPB151 BGCs
      4.3.6.5 Azospirillum INR13 BGCs
      4.3.6.6 PKS and NRPS gene amino acid sequence similarity to known genomic regions
    4.3.7 NaPDoS analysis of condensation and ketosynthase domains
  4.4 DISCUSSION

CHAPTER 5
5 DISCUSSION AND CONCLUSIONS
  5.1 RESEARCH MOTIVATIONS AND OBJECTIVES
  5.2 KEY FINDINGS
    5.2.1 Soil fertility is associated with natural product gene presence and diversity in polar desert soils
    5.2.2 Bacterial adaptation to the Antarctic environment includes desiccation-, starvation- and radiation-resistance
    5.2.3 Biosynthetic gene clusters in Antarctic bacteria highlight survival strategies
      5.2.3.1 Carotenoids, siderophores and biosurfactants
      5.2.3.2 Long-chain polyunsaturated fatty acids
    5.2.4 Antarctic soil bacteria contain an abundance of uncharacterised biosynthetic domains
    5.2.5 Eukaryotic cells are targeted by many of the predicted biosynthetic pathways
    5.2.6 Long-read sequencing for natural product domain amplicon and genomic BGC analysis
  5.3 FUTURE DIRECTIONS

REFERENCES

APPENDICES

ACKNOWLEDGMENTS

Firstly, I would like to thank my Supervisor, Belinda Ferrari, who devotes an incredible amount of time to her students, providing motivation, inspiration and support, as well as steering us in challenging directions. I couldn't have asked for a better supervisor. Other special thank-yous go to John Kalaitzis, Josie van Dorst, Mukan Ji, Mark Brown and Brett Neilan, who have all shared their valuable time and expertise, and to Eden Zhang, who provided the NRPS domain sequencing and processing data for Chapter 2 and worked tirelessly and patiently alongside me to co-author the publication stemming from that research. For genome assembly and annotation, and a great deal of bioinformatics support for genome analysis in Chapter 4, I would like to thank Dr Richard Edwards and Timothy Amos, UNSW. Thanks also to Brigid Betz-Stablein and Mark Tanaka from UNSW Sydney’s Stats Central for statistical advice in Chapter 2. To the Ferrari Lab team members not already mentioned, thank you for being the most friendly, helpful and enthusiastic people: Sarita Pudasaini, Sally Crane, Kate Montgomery, Angelique Ray, Sin Yin Wong, Carolina Gutiérrez-Chávez, Lauren Williams, Jieyu Liu, Lucien Alperstein, Iskra Nicetic and Chengdong Zhang. Thank you also to the members of other labs who have helped me and shared their knowledge: Tim Williams and James Charlesworth. I would very much like to thank the AAD’s expedition teams in 2005 and 2012 for sample collection, and Bioplatforms Australia, who provided Vestfold Hills biodiversity data. For sequencing I would like to thank Tonia Russell, Dr Carolina Correa Ospina and Dr Jackie Chan from The Ramaciotti Centre, UNSW. Finally, and most of all, I would like to thank my partner Tony for being so supportive and always encouraging me, even when this journey has been extremely challenging for both of us at times; my friends; and my parents, who are always there for me, providing the love and financial support that made this possible at all. Thank you all.


LIST OF FIGURES

CHAPTER 1

Figure 1.1 Structural diversity within microbial natural product antibiotics

Figure 1.2 Examples of bacterial secondary metabolites

Figure 1.3 Biosynthesis of erythromycin A by the Type I PKS system, DEBS

Figure 1.4 Schematic of Module 1 from the trans-AT PKS responsible for virginiamycin biosynthesis

Figure 1.5 Schematic of actinorhodin biosynthesis by Type II PKS

Figure 1.6 Biosynthesis of vancomycin

Figure 1.7 Morphological examples of major antimicrobial-producing bacterial phyla

Figure 1.8 Characteristic developmental cycle of Streptomyces species

Figure 1.9 Novel in situ cultivation techniques supply microbes with diffusible nutrients from their natural environment

Figure 1.10 Under-explored and extreme environments are targets for novel natural products discovery

Figure 1.11 Cold-adapted bacteria lower the melting temperature of their membrane phospholipids, through modification of fatty acid (FA) components

Figure 1.12 Map of Antarctica

Figure 1.13 The Arctic climate is moderately less extreme than the Antarctic and supports more animal and vascular plant life

Figure 1.14 Bacterial phylogenetic diversity of McMurdo Dry Valleys, eastern Antarctica

CHAPTER 2

Figure 2.1 Maps of eastern Antarctica highlighting Windmill Islands and Vestfold Hills regions

Figure 2.2 Map of the high Arctic, focussing on Ellesmere Island and Svalbard

Figure 2.3 Geospatial transect sampling design

Figure 2.4 Capture of natural product diversity in polar soils

Figure 2.5 PKS domain sequence taxonomy by genera and phyla, assigned through BLASTx analysis

Figure 2.6 NRPS domain sequence taxonomy by genera and phyla, assigned through BLASTx analysis

Figure 2.7 Phylogenetic relationship of PKS protein sequences with reference bacteria based on BLASTx output

Figure 2.8 Phylogenetic relationship of NRPS protein sequences against reference bacteria based on BLASTx output

Figure 2.9 Soil bacterial diversity observed from 16S amplicon sequencing of soil from each of the 12 sites analysed

Figure 2.10 Actinobacterial diversity by Order and Family, observed from 16S amplicon sequencing

Figure 2.11 Natural product gene amplification revealed significant relationships with soil carbon, and dry matter fraction

Figure 2.12 Natural product gene association with soil fertility factors

Figure 2.13 Natural product gene nMDS analysis

Figure 2.14 Bacterial community 16S rDNA gene analysis and measured soil parameters show clustering similarities

Figure 2.15 Natural product domain sequence novelty when compared to known secondary metabolite protein sequences for NRPS and PKS

CHAPTER 3

Figure 3.1 Under starvation conditions Myxococcales form conspicuous, macroscopic fruiting bodies

Figure 3.2 Antarctic soils used for culturing were selected from three pristine polar deserts and one human-impacted site

Figure 3.3 Bacterial 16S rDNA diversity for the three pristine samples cultured; HI, MP and RL

Figure 3.4 Direct soil culturing using both E. coli lawn and cellulose baiting methods on WCX agar plates

Figure 3.5 Direct soil culturing using the E. coli lawn method with the addition of rabbit dung pellets

Figure 3.6 Principles of the soil substrate membrane system (SSMS)

Figure 3.7 Flowchart for bacterial cultivation using cold-incubated SSMS

Figure 3.8 Pattern of inoculation for cross-streak agar assay

Figure 3.9 Substrate mycelium-like filaments were observed by microscopy of direct soil cultures

Figure 3.10 Visible colonies were directly picked from soil cultures using a sterile toothpick and stereomicroscopy

Figure 3.11 Colony morphology for six different Streptomyces isolates

Figure 3.12 DSC isolates with ≤ 98% 16S rDNA gene sequence similarity to known species

Figure 3.13 Cold-incubated SSMS microcolonies visualised using epi-fluorescence microscopy

Figure 3.14 Bacteria cultured from HI by the cold-incubated SSMS

Figure 3.15 Cultured bacterial species recovered across four Antarctic soils by DSC and the SSMS

Figure 3.16 Carotenoid-like pigmentation was observed in half of all cultured isolates

Figure 3.17 Cross-streak antimicrobial assay for bacterial isolates

CHAPTER 4

Figure 4.1 Functional classification of protein-coding genes in Antarctic bacterial genomes by abundance of Clusters of Orthologous Groups (COGs)

Figure 4.2 Maximum likelihood phylogenetic tree of 16S rDNA gene for Streptomyces Antarctic isolates

Figure 4.3 Maximum likelihood phylogenetic tree of 16S rDNA gene for Antarctic isolates belonging to the Actinobacteria phylum (excepting Streptomyces)

Figure 4.4 Maximum likelihood phylogenetic tree of 16S rDNA gene for Antarctic isolates belonging to the Proteobacteria phylum

Figure 4.5 Maximum likelihood phylogenetic tree of 16S rDNA gene for Bacteroidetes Antarctic isolate Hymenobacter NBH84

Figure 4.6 Biosynthetic gene clusters detected by antiSMASH in Antarctic bacterial genomes

Figure 4.7 Circular representation of the Streptomyces isolate INR7 genome

Figure 4.8 The Streptomyces INR7 genome contains an NRPS BGC, Region 25, with 100% gene similarity to the tambromycin BGC

Figure 4.9 Circular representation of the Streptomyces isolate NBH77 genome

Figure 4.10 The Streptomyces isolate NBH77 NRPS-Type I PKS BGC, Region 3

Figure 4.11 Circular representation of the Streptomyces isolate NBSH44 genome

Figure 4.12 The putative plasmid, contig 2, carried by Streptomyces NBSH44

Figure 4.13 Circular representation of the Kribbella isolate SPB151 genome

Figure 4.14 The Azospirillum isolate INR13 harbours a potential polyunsaturated fatty acid (PUFA) synthase cluster

Figure 4.15 Phylogenetic analysis of ketosynthase domains by maximum likelihood method against NaPDoS database domains

Figure 4.16 Phylogenetic analysis of condensation domains by maximum likelihood method against NaPDoS database domains

LIST OF TABLES

CHAPTER 2

Table 2.1 Mean annual weather statistics for regions within eastern Antarctica and the high Arctic

Table 2.2 PCR primers and conditions for amplification of PKS ketosynthase/acyl transferase domains, and NRPS adenylation domains

Table 2.3 Analyses of relationship between natural product gene presence and total carbon (TC), total nitrogen (TN) and dry matter fraction (DMF)

CHAPTER 3

Table 3.1 Location and soil characteristics for selected Antarctic soils

Table 3.2 Primer sets employed for PCR targeting 16S bacterial rDNA, PKS and NRPS domain fragments

Table 3.3 Strains isolated in previous studies which were screened for PKS and NRPS domains and antimicrobial activity

Table 3.4 Phylogenetic distribution of bacterial species cultured from all sites by DSC

Table 3.5 Phylogenetic distribution of isolates cultured from Herring Island by cold-temperature SSMS

Table 3.6 SSMS followed by liquid media enrichment conditions for recovered species

Table 3.7 Natural product domain amplification and in situ antimicrobial activity for strains isolated in this study

Table 3.8 Natural product domain amplification and in situ antimicrobial activity for isolates from previous studies

Table 3.9 Characteristics used to select eighteen strains for whole genome sequencing

CHAPTER 4

Table 4.1 Genomic DNA extraction methods for Antarctic bacteria

Table 4.2 Distribution of isolates within all three multi-genome DNA libraries

Table 4.3 Sequencing output and assembly summaries for three multi-genome libraries

Table 4.4 Antarctic bacterial genome assembly and quality assessments

Table 4.5 Predicted protein-coding sequences and CDS assigned to COGs

Table 4.6 Proportion of Antarctic bacterial genomes dedicated to secondary metabolite biosynthetic clusters

APPENDICES

APPENDIX ONE (CHAPTER 2)

Appendix Table A1.1 Taxonomic classification of PKS KS/AT domain sequences when analysed using both nucleotide BLASTn and translated protein sequence BLASTx algorithms

Appendix Table A1.2 Taxonomic classification of NRPS AD domain sequences when analysed using both nucleotide BLASTn and translated protein sequence BLASTx algorithms

APPENDIX TWO (CHAPTER 3)

A2.1 STOCK SOLUTIONS

A2.2 MEDIA

APPENDIX THREE (CHAPTER 4)

A3.1 MEDIA

Appendix Table A3.1 Representative reference genomes chosen for mapping to Antarctic bacterial libraries, with corresponding quality measures determined by CheckM

Appendix Table A3.2 Biosynthetic gene clusters detected in Antarctic bacterial genomes by antiSMASH

Appendix Table A3.3 NRPS gene BLASTp and NaPDoS analysis of condensation domains

Appendix Table A3.4 Type I PKS gene BLASTp and NaPDoS analysis of ketosynthase domains

Appendix Table A3.5 Hybrid NRPS/Type I PKS gene BLASTp and NaPDoS analysis of condensation and ketosynthase domains

Appendix Table A3.6 Type II PKS gene BLASTp and NaPDoS analysis of ketosynthase domains

ABBREVIATIONS

AAD Australian Antarctic Division

AAP Aerobic anoxygenic phototroph

ACP Acyl carrier protein domain

AD Adenylation domain

AF Adams Flat

AFH Alexandra Fjord Highland

AMR Antimicrobial Resistance

AT Acyltransferase domain

A+T Adenine and thymine

ATCC American type culture collection

BGC Biosynthetic gene cluster

BP Browning Peninsula

bp Base pairs

C Condensation domain

CHS Chalcone synthase

CRE Carbapenem resistant Enterobacteriaceae

CS Casey Station

d Day(s)

DEBS 6-deoxyerythronolide B synthase

DH Dehydratase domain

DMF Dry matter fraction

DMSO Dimethyl sulfoxide

DNA 2’deoxyribonucleic acid


dNTP Deoxynucleotide triphosphate

E Epimerisation domain

EB Elution buffer

ER Enoylreductase domain

FA Fatty Acid

FAS Fatty acid synthase

g G-force

g Gram

Gb Giga base pairs

G+C Guanine and cytosine

h Hour(s)

HI Herring Island

HTS High-throughput sequencing

HV Heidemann Valley

ISP4 Inorganic salts starch agar

kb Kilobase pairs

km Kilometre

KR Ketoreductase domain

KS Ketosynthase domain

L Litre

LC-MS Liquid chromatography-mass spectrometry

LC-PUFA Long-chain polyunsaturated fatty acid

M Methylation domain

Mb Mega base pairs


min Minute(s)

mL Millilitre

mm Millimetre

MP Mitchell Peninsula

MRSA Methicillin-resistant Staphylococcus aureus

NA Nutrient agar

ng Nanogram

NMR Nuclear magnetic resonance spectroscopy

NP Natural product

NRP Non-ribosomal peptide

NRPS Non-ribosomal peptide synthetase

OW Old Wallow

PBS Phosphate buffered saline

PCP Peptide carrier protein domain

PCR Polymerase chain reaction

pmol Picomole

PK Polyketide

PKS Polyketide synthase

PUFA Polyunsaturated fatty acid

R Reduction domain

rDNA Ribosomal deoxyribonucleic acid

RL Rookery Lake

RNA Ribonucleic acid

RR Robinson Ridge


RT Room temperature (~21°C)

s Second(s)

SGS Second-generation sequencing

SM Secondary metabolite

SMRT Single Molecule Real-Time

sp. Species

spp. Species (plural)

TAE Tris-acetate-ethylenediaminetetraacetic acid

TC Total carbon

TE Thioesterase domain

TE buffer Tris-ethylenediaminetetraacetic acid

TGS Third-generation sequencing

TN Total nitrogen

µL Microlitre

µm Micrometre

µM Micromolar

v/v volume/volume

WGS Whole genome sequencing

WT Wilkes Tip

w/v weight/volume

CHAPTER ONE

1 INTRODUCTION

1.1 Antibiotic resistance drives the need for novel bioactive compounds

Antimicrobial resistance (AMR) has been described as one of the primary threats currently facing human health. Predictions of a post-antibiotic era emerged following the discovery, in 2008, of a carbapenem-resistant Enterobacteriaceae (CRE) isolate resistant to all clinically useful drugs (Kumarasamy et al. 2010, WHO, 2015a). The dissemination of CRE is of particular concern because the Gram-negative Enterobacteria are a common cause of infection, affecting approximately 140,000 people per year in the United States (US) alone. Furthermore, carbapenem drugs are considered the last-resort treatment (Ventola 2015, CDC, 2013). The mortality rate for those infected by CRE currently stands at 26-44% (Falagas et al. 2014). In addition to CRE, other primary AMR concerns include: multi-drug resistant tuberculosis, which accounts for an estimated 240,000 deaths per year worldwide (WHO, 2017); methicillin-resistant Staphylococcus aureus (MRSA), a common cause of nosocomial and community-acquired infections, responsible for over 11,000 deaths per year in the US (CDC, 2013); and the sexually transmitted disease gonorrhoea, which annually affects over 78 million people worldwide, and whose multi-resistance now includes the last-line cephalosporins (WHO, 2015b). The incidence of multi-drug resistant fungal infections such as Candida is also increasing, with limited therapeutic options (Arendrup & Patterson 2017). Overall, deaths resulting from resistant infections are double those for non-resistant strains (WHO, 2014).

The development of pan-resistance has prompted the clinical return of drugs discontinued due to poor safety profiles. For example, one of only two treatment options now available for CRE infection is colistin, a polymyxin drug which was removed from human use due to considerable nephrotoxicity (Li et al. 2006, MacNair et al. 2018). Ironically, throughout its hiatus from human treatment, colistin has remained in agricultural use, and colistin resistance has already been detected and traced to pig production (Liu et al. 2016, MacNair et al. 2018). This development highlights the problematic relationship between medicine and agriculture in terms of antibiotic stewardship.

Antibacterial resistance genes were present in the environment long before the clinical use of antibiotics (Abraham & Chain 1940, D’Costa et al. 2011, Aminov & Mackie 2007), with genes conferring resistance to modern clinical antimicrobials amplified from 30,000-year-old permafrost sediment ice cores (D’Costa et al. 2011). Resistance mechanisms are thought to have evolved due to complex competitive relationships between microorganisms and the environment (Aminov & Mackie 2007). However, while the development of AMR is a naturally occurring evolutionary phenomenon, selection pressures brought on by the overuse and misuse of antibiotics, both in health care and agriculture, have accelerated the evolution of resistance (Kumarasamy et al. 2010, Ventola 2015). Additionally, the genes conveying resistance may be easily shared amongst bacteria through horizontal gene transfer (HGT), due to carriage on gene cassettes, integrons and plasmids (Liu et al. 2016, Aminov & Mackie 2007, Baker et al. 2018).

The discovery of CRE not only heralded a post-antibiotic era, but also highlighted the inadequacies in current antibiotic discovery efforts, particularly against Gram-negative pathogens (CDC, 2013, Wyres & Holt 2018). For over a decade there has been a steady decline in antibiotic research and development by pharmaceutical companies, due primarily to economic factors (Baltz 2007, Bérdy 2005, Gilbert 2010). The number of companies pursuing antibiotic research fell from 18 in 1990 to just 4 in 2013 (Butler et al. 2013). Antibiotics are time-consuming and expensive to develop, taking 10-15 years and costing between US$800 million and US$1.7 billion to produce. What's more, in comparison with drugs for chronic illness, the profit margin is low, due to short-term treatment regimens (Lobanovska & Pilla 2017, IDSA, 2004). This downsizing has had a noticeable impact on the number of new drug approvals, with only two novel drugs approved in the five-year period 2008-2012, compared with 16 between 1983 and 1987 (CDC, 2013, IDSA, 2004). Recent government and health organisation incentives have resulted in a number of fast-tracked approvals, leading to several new antibiotics approved by the FDA in 2014 and 2015 (FDA, 2014, FDA, 2015). Unfortunately, none of these new antibiotics address the problem of CRE.

1.2 Microbial natural products

Most antibiotics share a natural product (NP) origin. NPs are small (< 3000 g mol⁻¹), bioactive molecules produced by microbes (bacteria, fungi, yeasts and slime moulds), plants, and some animals (Bérdy 2005, Harvey et al. 2015, Donadio et al. 2007). NPs have a vast range of known chemical structures, and their bioactivities are similarly broad, including antiviral, antibacterial, antifungal, insecticidal, antitumour, anticholesterol and antiparasitic activities (Newman & Cragg 2012, Shen 2015).

Figure 1.1 Structural diversity within microbial natural product antibiotics. β-lactams (e.g. penicillin, cefotaxime, meropenem) contain 4-membered lactam rings. Aminoglycosides (e.g. gentamicin) contain amino sugar moieties. Streptogramins (e.g. virginiamycin) and macrolides (e.g. erythromycin) both contain macrocyclic lactone rings. Tetracyclines contain four hydrocarbon rings. Ansamycins (e.g. rifamycin) are cyclic structures formed from an aromatic moiety and an aliphatic chain. Glycopeptides (e.g. vancomycin) and lipopeptides (e.g. daptomycin) are peptides with attached sugar or lipid moieties, respectively. Source: ChemACX database (2018).

Antimicrobial NP structural groups (Fig. 1.1) include the β-lactams (e.g. penicillin, cephalosporins, carbapenems), glycopeptides (e.g. vancomycin, bleomycin), lipopeptides (e.g. daptomycin), macrolides (e.g. erythromycin, pimaricin), ansamycins (e.g. rifamycin, geldanamycin), tetracyclines (e.g. doxycycline, tetracycline), aminoglycosides (e.g. streptomycin, gentamicin), and streptogramins (e.g. etamycin, virginiamycin) (Harvey et al. 2015, Bérdy 2005).

Despite advances in chemical synthesis methods, microbially-derived NPs remain at the forefront of antibiotic drug discovery (Shen 2015, Harvey et al. 2015, Pye et al. 2017). Combinatorial chemical synthesis, while initially showing enormous potential for development of novel structures, has instead emphasised the major challenge of successful drug design: bioactivity. Of the more than 4 million new organic chemicals designed, only 0.001% have gone on to become clinically useful drugs, compared to ~0.3% from NP sources (Bérdy 2005, Newman & Cragg 2012). Unlike their synthetic counterparts, NPs have evolved their biological activity. Perhaps it is therefore not surprising that they tend to show better human bioavailability, attributable to similarities between microbial and mammalian metabolites (Bérdy 2005, Harvey et al. 2015).
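The contrast between the two success rates quoted above can be made concrete with a short calculation. The sketch below is illustrative only: the constant and function names are introduced here, and the library size of 4 million is treated as a lower bound taken from the "> 4 million" figure in the text.

```python
# Success rates quoted in the text, as percentages of compounds that became
# clinically useful drugs.
SYNTHETIC_HIT_RATE_PCT = 0.001  # designed organic chemicals
NP_HIT_RATE_PCT = 0.3           # natural products (approximate, "~0.3%")

# Lower bound on the number of designed chemicals ("> 4 million" in the text).
SYNTHETIC_LIBRARY_SIZE = 4_000_000


def expected_drugs(library_size: int, hit_rate_pct: float) -> float:
    """Number of clinically useful drugs implied by a library size and hit rate."""
    return library_size * hit_rate_pct / 100.0


def fold_difference(rate_a_pct: float, rate_b_pct: float) -> float:
    """How many times larger rate_a is than rate_b."""
    return rate_a_pct / rate_b_pct


if __name__ == "__main__":
    drugs = expected_drugs(SYNTHETIC_LIBRARY_SIZE, SYNTHETIC_HIT_RATE_PCT)
    fold = fold_difference(NP_HIT_RATE_PCT, SYNTHETIC_HIT_RATE_PCT)
    print(f"Synthetic library: about {drugs:.0f} clinically useful drugs")
    print(f"NP hit rate is about {fold:.0f}-fold higher")
```

Under these assumptions, 0.001% of 4 million corresponds to roughly 40 compounds, and the quoted NP success rate is about 300 times that of the designed chemicals.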


Over 23,000 bioactive microbial compounds have been uncovered thus far, with filamentous bacteria and fungi amongst the most prolific producers of those used in medicine and industry (Katz & Baltz 2016, Bérdy 2005). Together, they account for approximately 90% of all clinical antibiotics. Bacteria from the Actinomycetales order (phylum Actinobacteria) are particularly prolific, being responsible for around 65% of antibiotics currently on the market (Bérdy 2005, Newman & Cragg 2012).

1.3 Natural product biosynthesis

Figure 1.2 Examples of bacterial secondary metabolites. Astaxanthin is a carotenoid produced by many red-orange pigmented bacteria such as Paracoccus spp. (Tsubokura et al. 1999). Desferrioxamine, a siderophore, is produced by a diverse range of bacteria including all Streptomyces spp. (Barona-Gómez et al. 2004). The cyanobacterial toxin cyanopeptolin contributes to water contamination during blooms of Cyanobacteria such as Microcystis spp. (Faltermann et al. 2014). The lipopeptide biosurfactant and toxin, syringomycin, is commonly produced by soil and rhizosphere bacteria such as Pseudomonas spp. (Raaijmakers et al. 2010).

Microorganisms produce NPs through secondary metabolism pathways. While not essential for the organism's viability, secondary metabolites are involved in a variety of chemical processes and are hypothesised to confer an evolutionary advantage (Karlovsky 2008, Maplestone et al. 1992). The full extent of their ecological role remains to be determined, but NPs contribute to complex competitive and symbiotic interactions with other organisms (Karlovsky 2008). NPs include toxins such as cyanopeptolin (Fig. 1.2); photoprotective pigments, such as carotenoids and melanins; siderophores, such as desferrioxamines, which facilitate the extraction of iron and other essential metals; and biosurfactants, such as syringomycin (Fig. 1.2) (Fechtner et al. 2011, Raaijmakers et al. 2010).

Correlations have been reported between the size of microbial genomes and the carriage of NP genes. In genomes under 2 Mb, NP genes appear rare or absent (Donadio et al. 2007, Wang et al. 2014, Jenke-Kodama et al. 2005). Genome size is not the sole factor, however, as NP genes are also absent from certain large bacterial genomes (> 8 Mb). NP-encoding regions are large, and their encoded products are energetically expensive to produce. Their maintenance thus comes with a high metabolic cost, suggesting an equally strong selective pressure for their upkeep (Wang et al. 2014, Fischbach et al. 2008, Pickens et al. 2011). Genes encoding NPs are typically arranged in clusters, and in prokaryotes their transcription is usually controlled as a single operon (Pfeifer & Khosla 2001, Zhu et al. 2014). Genome sequencing has shown that individual Actinomycetales may carry over 30 different biosynthetic gene clusters (BGCs). However, the majority are not expressed under general laboratory conditions (Zaburannyi et al. 2014). For example, Streptomyces coelicolor is known to produce five antibiotics, yet its genome reveals 29 predicted BGCs (Liu et al. 2013).

NPs are produced in bacteria through a variety of enzyme-catalysed pathways. The majority are biosynthesised by two mega-enzyme systems, the polyketide synthase (PKS) and non-ribosomal peptide synthetase (NRPS) families, and by hybrids of these systems (Donadio et al. 2007, Newman et al. 2017). Others include terpene, aminocoumarin, aminoglycoside, nucleoside, alkaloid and ribosomal-peptide pathways (Medema et al. 2011).

1.3.1 Polyketide Synthases (PKS)

PKSs catalyse consecutive condensation reactions between small carboxylic acid derivatives, in a process similar to fatty acid (FA) biosynthesis (Hertweck 2009, Weissman 2015b). Acetyl-CoA commonly forms the starter unit, while extender units are usually malonyl-CoA or methylmalonyl-CoA (Donadio et al. 2007). Although PKS clusters are generally similar in basic structure, and display high genetic conservation of domains, their polyketide (PK) products are a highly diverse group of chemicals, with a broad range of activities (Banik & Brady 2010, Pfeifer & Khosla 2001). They include antibiotics, such as erythromycin; immunosuppressants, such as sirolimus (rapamycin); anti-cholesterol drugs, such as lovastatin; and potent anticancer compounds, such as the epothilones (Pfeifer & Khosla 2001, Katz & Baltz 2016). Diversity is achieved through small variations on the basic building theme, plus additional post-synthesis modifications (Moffitt & Neilan 2003, Hertweck 2009). Several types of PKS have been characterised, classified according to their resemblance to the type of fatty acid synthase (FAS) from which they most certainly evolved (Jenke-Kodama et al. 2005). They are designated Types I, II or III, but many also display hybrid functionality (Hertweck 2009, Weissman 2015b).

1.3.1.1 Type I PKS

Type I PKS resemble Type I FAS, found in animals and fungi. They are the most versatile and complex of the PKSs (Keatinge-Clay 2012). Often, Type I PKS systems construct complex ring molecules comprising 8 to 62 carbons, such as erythromycin (Fig. 1.3). The typical Type I PKS, also referred to as a cis-AT, modular, or non-iterative PKS, is constructed with multiple catalytic sites arranged along numerous modules, under the same open reading frame (ORF) (Fig. 1.3) (Cuadrat et al. 2018, Davison et al. 2014). Entire BGCs can be upwards of 150 kb in length and are usually high (> 70%) in guanine-cytosine (G+C) content (Laureti et al. 2011, Peiru et al. 2009, Pfeifer & Khosla 2001). Each module is responsible for one extension step during formation of the PK chain. The product is then passed to the next module, in a process analogous to assembly line manufacture (Fischbach & Walsh 2006). Modular, non-iterative Type I PKS are exemplified by the enzyme 6-deoxyerythronolide B synthase (DEBS), which synthesises the macrolide antibiotic erythromycin A (Fig. 1.3), originally isolated from Saccharopolyspora erythraea (Staunton & Wilkinson 1997).

Since its discovery in the early 1990s, DEBS has been extensively studied (Pfeifer & Khosla 2001, Davison et al. 2014). Three core domains, an acyltransferase (AT), an acyl carrier protein (ACP) and a ketosynthase (KS), exemplify a minimal module of Type I PKS (Fig. 1.3). The AT domain binds the chosen extender unit and transfers it to the ACP, forming a thioester bond. The KS domain then catalyses a decarboxylative condensation between this extender unit and the growing intermediate PK bound to the ACP domain of the previous module (Fig. 1.3) (Cane 2010, Staunton & Wilkinson 1997). The loading module of the DEBS PKS contains only AT and ACP sites, and the terminal module contains only a thioesterase (TE) domain, responsible for the completion and release of the PK (Cane 2010). Additional PKS domains can include β-ketoreductase (KR), dehydratase (DH), enoylreductase (ER), and in some cases methylation (M) domains (Donadio et al. 2007).

Recently, a variant of the Type I PKS system has been described which is thought to have evolved independently: the trans-AT PKS (Helfrich & Piel 2016, Davison et al. 2014, Cheng et al. 2003). While generally resembling cis-AT PKS modularity, trans-AT PKSs feature important distinctions, primarily that the AT domains are discrete, free-standing proteins that act iteratively (Fig. 1.4). Trans-AT PKSs may also exhibit unusual domain type and order, duplicated and functionless domains, and modules split between two proteins. They are also commonly found as hybrid systems incorporating NRPS (Helfrich & Piel 2016, Davison et al. 2014, Wang et al. 2014).

Figure 1.3 Biosynthesis of erythromycin A by the Type I PKS system DEBS. The starter unit is propionyl-CoA and the extender units are six methylmalonyl-CoA units. Modules involved in the production of 6-deoxyerythronolide B include a loading module, six modules for polyketide extension, and a terminal TE which cyclises and releases the product. The loading module contains an AT and ACP, while chain elongation modules 1-6 minimally contain AT, ACP and KS domains. Additional domains are KR, DH and ER, and the DEBS system comprises three large proteins (DEBS 1, 2 and 3). The final product, erythromycin A, is formed following several post-PKS modifications: two hydroxylations, two glycosylations and a methylation. Adapted from Davison et al. (2014).

Figure 1.4 Schematic of Module 1 from the trans-AT PKS responsible for virginiamycin biosynthesis, which includes more than nine chain extension modules. This system exhibits several typical characteristics of the trans-AT PKS, including a discrete, iteratively acting AT domain and duplicated domains (e.g. ACP). Adapted from Davison et al. (2014).

General predictions can be made about the chemical structure of unknown PKS products, based on the genes within the BGC and their arrangement. As trans-AT PKS lack the co-linearity of cis-AT PKS, structural predictions are more difficult (Davison et al. 2014). For cis-AT PKS, the length of the PK can be estimated from the number of enzyme modules, and the domains within each module indicate the types of reactions the PK has undergone (Donadio et al. 2007). Remarkably, fragments of KS domain sequences alone can accurately predict BGC end-products. This was demonstrated by Gontang et al. (2010), who PCR-amplified KS domain sequences (~700 bp) with a high level of amino acid sequence similarity (≥ 85%) to previously characterised tetronomycin BGCs, and subsequently confirmed tetronomycin production in the isolates harbouring them.
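The co-linearity rule for cis-AT PKS can be illustrated with a minimal Python sketch. The `Module` class, the `predict_backbone` function and the simplified β-carbon labels are hypothetical names introduced here for illustration. The example layout follows the general DEBS description given earlier (a loading module, six extension modules and a terminal TE), but the reductive domains assigned to each module are assumptions rather than a curated annotation, and real BGC annotation tools apply far more detailed rules.

```python
from dataclasses import dataclass
from typing import List

# Reductive domain sets and the state they leave at the beta-carbon, checked
# from most to least reduced. Simplified labels, assumed for illustration only.
REDUCTION_STATES = [
    ({"KR", "DH", "ER"}, "methylene (fully reduced)"),
    ({"KR", "DH"}, "alkene"),
    ({"KR"}, "hydroxyl"),
]


@dataclass
class Module:
    name: str
    domains: List[str]  # e.g. ["KS", "AT", "KR", "ACP"]

    def is_extension_module(self) -> bool:
        # A minimal cis-AT extension module carries KS, AT and ACP domains.
        return {"KS", "AT", "ACP"} <= set(self.domains)

    def beta_carbon_state(self) -> str:
        present = set(self.domains)
        for required, state in REDUCTION_STATES:
            if required <= present:
                return state
        return "ketone (unreduced)"


def predict_backbone(modules: List[Module]) -> dict:
    """Estimate extension steps and per-module processing from domain content."""
    extenders = [m for m in modules if m.is_extension_module()]
    return {
        "extension_steps": len(extenders),
        "beta_carbon_states": [(m.name, m.beta_carbon_state()) for m in extenders],
        "has_release_domain": any("TE" in m.domains for m in modules),
    }


# Hypothetical domain layout (assumed, not a curated annotation).
example = [
    Module("load", ["AT", "ACP"]),
    Module("M1", ["KS", "AT", "KR", "ACP"]),
    Module("M2", ["KS", "AT", "KR", "ACP"]),
    Module("M3", ["KS", "AT", "ACP"]),
    Module("M4", ["KS", "AT", "DH", "ER", "KR", "ACP"]),
    Module("M5", ["KS", "AT", "KR", "ACP"]),
    Module("M6", ["KS", "AT", "KR", "ACP"]),
    Module("release", ["TE"]),
]
prediction = predict_backbone(example)
```

In this sketch the number of extension steps is simply the count of modules carrying the minimal KS, AT and ACP set, which is why the approach does not transfer to trans-AT systems, where AT domains are free-standing and modules may be split or duplicated.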


1.3.1.2 Type II PKS

Type II PKS are found almost exclusively in Actinobacteria, but resemble Type II FAS found in plants and bacteria (Hertweck 2009, Das & Khosla 2009). They differ from Type I PKS in that they are collections of separate enzymes, working iteratively with minimal domain functions. Although the enzymes are discrete entities, in vivo it is thought they form complexes similar to the Type I PKS enzymes (Hertweck 2009, Austin & Noel 2003). Additionally, although they retain acyl-transferase activity, they lack a dedicated AT domain. Uniquely among the PKS enzymes, Type II PKS contain two coupled β-ketosynthase subunits, KS and chain length factor (CLF), which form a heterodimer (Hertweck 2009, Kim & Yi 2012). Type II PKS typically produce aromatic, polycyclic structures, such as the tetracyclines (Fig. 1.1). Biosynthesis is exemplified by the production of the isochromanequinone antibiotic actinorhodin by Streptomyces coelicolor, from eight malonyl precursors (Fig. 1.5) (Pfeifer & Khosla 2001, Keatinge-Clay et al. 2004). Tailoring enzymes can include oxygenases and glycosyl-, amino- and methyl-transferases. Several subclasses are known, yet many complexities of Type II PKS production remain to be determined (Kim & Yi 2012).

Figure 1.5 Schematic of actinorhodin biosynthesis by Type II PKS. PK initiation and elongation is catalysed by the coupled KS-CLF domains, which perform repetitive decarboxylative condensations of malonyl precursors, delivered to them as thioesters by ACP. Malonyl precursors are attached to the ACP by malonyl-CoA:ACP transacylase (MCAT). Following elongation, the ACP delivers the chain to tailoring enzymes, including KR, responsible for carbonyl group reduction, and aromatases (ARO) and cyclases (CYC), which catalyse regiospecific cyclisations of the chain. Adapted from Kim & Yi (2012) and Das & Khosla (2009).

1.3.1.3 Type III PKS

The third type of PKS is often referred to as CHS-type PKS, due to its similarity to the first and most well-known Type III PKS, chalcone synthase (CHS), discovered in plants (Austin & Noel 2003, Shimizu et al. 2017). Type III PKS are not well investigated, with the first bacterial Type III PKS not characterised until 1999. They have a simple architecture, usually consisting of a self-contained homodimer of identical KS domains. Starter building blocks may be ring or chain-type acyl-CoA units, such as benzoyl-CoA or malonyl-CoA, and extender units are usually malonyl-CoA (Shimizu et al. 2017). A single active site within each KS domain catalyses PK synthesis through iterative priming, extension, and cyclisation reactions. Other accessory tailoring enzymes perform downstream modifications such as hydroxylation, acetylation, oxidation and methylation. Type III PKSs often produce precursor molecules for antibiotics, UV protection pigments, antimicrobial resistance enzymes and alternative electron carriers (Austin & Noel 2003, Shimizu et al. 2017).

1.3.2 Non-ribosomal peptide synthetases (NRPS)

NRPS produce many medically useful bioactive compounds, including antibiotics such as penicillin and vancomycin, immunosuppressants like cyclosporine, and anti-tumour drugs, such as bleomycin (Roongsawang et al. 2005, Weissman 2015a). NRPS possess many similarities to Type I PKS, and indeed they may form hybrid mega-enzymes which utilise both peptide and ketide extension units (Weissman 2015a). Like cis-AT PKS, NRPS are large, multi-functional, modular enzymes, which synthesise complex molecules via oligomerisation of smaller building blocks. Here, however, the precursors are amino acids or hydroxy acids (Donadio et al. 2007, Weissman 2015a). Three core domains usually make up the minimal NRPS module: an adenylation (AD) domain, a peptide carrier protein (PCP) and a condensation (C) domain (Fig. 1.6) (Donadio et al. 2007, Weissman 2015a). The AD domain selects and activates the appropriate amino acid. The resultant aminoacyl adenylate is then transferred and bound to the PCP via a thioester bond. The C domain catalyses peptide bond formation between the PCP-bound amino acid and the peptidyl intermediate bound to the preceding module’s PCP. As in PKS, there is a loading module, containing only the AD and PCP domains, and a cyclising and terminating TE domain. Additional domains can include epimerisation (E) domains, which convert L-amino acids into the required D-amino acids, as well as reduction (R) and methylation (M) domains (Donadio et al. 2007, Roongsawang et al. 2005). A typical NRPS cluster is exemplified by the biosynthetic pathway of the glycopeptide antibiotic vancomycin, first isolated from Amycolatopsis orientalis (Fig. 1.6).
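The division of labour between NRPS domains can be sketched as a toy assembly line in Python. This is a simplified illustration under stated assumptions: the `NrpsModule` class and `assemble_peptide` function are hypothetical names introduced here, the three-module example is invented and does not represent vancomycin or any real pathway, and chemistry such as cyclisation, reduction and methylation is ignored.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class NrpsModule:
    substrate: str      # amino acid selected and activated by the AD domain
    domains: List[str]  # domain labels as used in the text, e.g. ["C", "AD", "PCP"]


def assemble_peptide(modules: List[NrpsModule]) -> Tuple[List[str], bool]:
    """Walk the modules in order and return (residues, released)."""
    residues = []
    for index, module in enumerate(modules):
        present = set(module.domains)
        # Every module needs AD (selection/activation) and PCP (thioester carrier).
        if not {"AD", "PCP"} <= present:
            raise ValueError(f"module {index} lacks an AD or PCP domain")
        # Modules after the loading module need a C domain to form the peptide bond.
        if index > 0 and "C" not in present:
            raise ValueError(f"module {index} lacks a C domain")
        # An E domain converts the incorporated L-amino acid to its D-form.
        configuration = "D" if "E" in present else "L"
        residues.append(f"{configuration}-{module.substrate}")
    # A TE domain in the final module terminates synthesis and releases the peptide.
    released = bool(modules) and "TE" in modules[-1].domains
    return residues, released


# Hypothetical three-module assembly line (not a real pathway).
example_line = [
    NrpsModule("Ala", ["AD", "PCP"]),
    NrpsModule("Val", ["C", "AD", "PCP", "E"]),
    NrpsModule("Leu", ["C", "AD", "PCP", "TE"]),
]
residues, released = assemble_peptide(example_line)
```

Each module contributes exactly one residue, so the peptide length follows from the module count in the same way that PK chain length follows from the number of cis-AT PKS modules.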


Figure 1.6 Biosynthesis of vancomycin. While some mechanisms of the pathway are still unclear, components of the vancomycin backbone are assembled by an NRPS with three subunits: VpsA, VpsB and VpsC. The seven modules contain domains with AD, PCP, C, E and TE functions. Module 7 contains a domain X, whose function remains unknown. Other tailoring steps include crosslinking by the P450-like enzymes OxyB, OxyA and OxyC, chlorine substitutions and glycosylation. Adapted from Shmartz et al. (2014).

1.4 Dominant natural product-producing bacterial phyla

Within the bacterial kingdom, NPs are disproportionately biosynthesised by four phyla: Actinobacteria, Proteobacteria, Cyanobacteria and Firmicutes (Fig. 1.7) (Donadio et al. 2007, Wang et al. 2014).

Figure 1.7 Morphological examples of major antimicrobial-producing bacterial phyla. (A) Actinobacteria: Streptomyces sp. isolated from Antarctic soil. (B) Proteobacteria: Myxococcus xanthus. Source: Velicer & Yu (2003). (C) Cyanobacteria: Nostoc flagelliforme. Source: Feng et al. (2012). (D) Firmicutes: Bacillus sp. isolated from Antarctic soil.

1.4.1 The Actinobacteria

Actinobacteria form a large proportion of soil microbial biomass, typically measuring 10⁶ to 10⁹ cells per gram of soil, and along with fungi are the primary decomposers of organic matter (Barka et al. 2016, Babalola et al. 2009, Sun et al. 2017). Streptomyces species, of the order Actinomycetales (Fig. 1.7A), are the most abundant producers of antibiotics (Bérdy 2005, Watve et al. 2001). However, the Actinobacteria phylum is large, and producers are also found amongst Mycobacterium, Arthrobacter, Rhodococcus, and Nocardia spp., as well as ‘rare’ Actinobacterial genera, defined as non-Streptomyces that are infrequently brought into culture (Tiwari & Gupta 2013, Subramani & Aalbersberg 2013). While not necessarily scarce in the environment, the rarer taxa are more fastidious, and include Saccharopolyspora, Micromonospora, Streptosporangium, Actinomadura, Streptoverticillium, Kribbella and Actinoplanes (Bérdy 2005, Subramani & Aalbersberg 2013, Lazzarini et al. 2000).

Figure 1.8 Characteristic developmental cycle of Streptomyces species. On germination, spores swell and give rise to a germ tube, which develops into substrate and aerial mycelium, comprised of multiple threadlike hyphae. Pre-spore compartments differentiate into spores at the ends of the aerial hyphae. The majority of NP biosynthesis coincides with the aerial hyphae growth and spore germination developmental stages. Adapted from Flärdh & Buttner (2009).

Many of the Actinomycetales, such as Streptomyces spp., have morphological similarities to fungi, including substrate and aerial mycelium formed from networks of hyphae, and spores, which are resilient to environmental stressors (Flärdh & Buttner 2009, Olano et al. 2014). The Streptomyces lifecycle is initiated by spore germination. Chains of cells branch out, forming substrate mycelium, followed by aerial mycelium. Spores then differentiate from apical compartments of the aerial hyphae (Fig. 1.8) (Flärdh & Buttner 2009, Olano et al. 2014). Research shows that the majority of NP expression coincides with the development of aerial hyphae and the stationary phase of growth, in response to nutrient deficiency (Liu et al. 2013, Čihák et al. 2017). NPs are also produced during germination, where they are hypothesised to function as signalling molecules and suppressors of competitor microorganisms (Čihák et al. 2017).

1.4.2 The Proteobacteria

The large and diverse Gram-negative phylum Proteobacteria is divided into five classes: Alpha- (α), Beta- (β), Gamma- (γ), Delta- (δ) and Epsilon- (ε) proteobacteria. Many genera are well-known pathogens, including Escherichia, Pseudomonas, Bordetella, Campylobacter, Neisseria, Legionella, Salmonella and Yersinia (Kersters et al. 2006, Gupta 2000).

Antibiotic producers are found in a diversity of Proteobacterial classes, including Gammaproteobacterial genera such as Pseudomonas and Lysobacter, and the family Vibrionaceae (Bérdy 2005, Xie et al. 2012, Mansson et al. 2011). The myxobacteria (Deltaproteobacteria) are prolific NP producers, and have been the source of some unusual compounds, such as the antitumour epothilones from Sorangium cellulosum, and saframycin from Myxococcus xanthus (Fig. 1.7B) (Wenzel & Müller 2007).

1.4.3 The Cyanobacteria

Cyanobacteria are phototrophic prokaryotes which can produce energy and oxygen through photosynthesis. Morphologically, they are a diverse group, ranging from single-cell forms to large multicellular filaments (Calteau et al. 2014). Cyanobacteria produce a range of bioactive NPs, including antimicrobial and antitumour compounds, and are notorious for producing dangerous toxins such as microcystin when proliferating in water habitats (Calteau et al. 2014, Faltermann et al. 2014). Important NP-producing genera within the Cyanobacteria phylum include Anabaena and Nostoc (Fig. 1.7C) (Burja et al. 2001).

1.4.4 The Firmicutes

All Gram-positive bacteria were once grouped under phylum Firmicutes, including the Actinobacteria. The groupings have since been re-designated, whereby the high G+C Actinobacteria and the low G+C Firmicutes form distinct phyla. Firmicutes members comprise endospore-forming genera such as Bacillus (Fig. 1.7D) and Clostridium, notable human pathogens like Streptococcus and Staphylococcus, and the pleomorphic genus Mycoplasma (Galperin 2013). Bacillus and Paenibacillus spp. produce a range of NPs, and are common producers of antibiotic biosurfactant lipopeptides (e.g. iturin, fengycin and surfactin), produced through NRPS pathways (Sansinenea & Ortiz 2011, Bérdy 2005, Sumi et al. 2015).

1.5 Microbial natural product diversity

Despite over 70 years of NP research, the diversity of natural chemistry is far from exhausted. Modelling has estimated that < 3% of all Streptomyces antibiotics have been uncovered thus far (Bérdy 2005, Watve et al. 2001, Harvey et al. 2015, Clardy et al. 2006). Technological limitations continue to impede discovery of novel compounds: the majority of microorganisms are recalcitrant to cultivation, many isolates do not express antibiotic compounds under standard laboratory conditions, continual re-discovery of known compounds wastes time and resources, and the screening process remains slow and labour intensive, despite attempts at creating more high-throughput technologies (Harvey et al. 2015, Palazzolo et al. 2017).

Conventional methods of cultivation are believed to recover less than 1% of soil microorganisms (Ferrari et al. 2008, Lewis 2013). Fastidious and rare organisms, particularly from the chemically rich Actinobacteria, are expected to harbour a vast untapped source of novel compounds, as NPs recovered from rare Actinobacteria have included unique molecules, such as the medically important aminoglycoside, gentamicin (Fig. 1.1), produced by Micromonospora spp. (Bérdy 2005, Lewis 2013). Thus, new culturing approaches are warranted.

Over the last decade, novel in situ culturing approaches have improved the recovery of recalcitrant microorganisms, including candidate divisions (Ferrari et al. 2005, Kaeberlein et al. 2002, Nichols et al. 2010a). These strategies all exploit extended incubation timeframes, combined with oligotrophic substrates supplying microbes with diffusible substances direct from their natural environment. For example, the soil substrate membrane system (SSMS), developed by Ferrari et al. (2005), cultivates microorganisms on the surface of a thin polycarbonate membrane, separated from the originating soil sample by a semi-permeable membrane (Fig. 1.9A).

Figure 1.9 Novel in situ cultivation techniques supply microbes with diffusible nutrients from their natural environment. Examples include (A) SSMS (Ferrari et al. 2008), (B) the iChip (Source: https://news.northeastern.edu/2017/03/21/researcher-develops-technology-to-advance-antibiotic-discovery/), and (C) the diffusion chamber (Source: http://blogs.jcvi.org/2014/08/trapping-microbes-750-miles-north-of-the-arctic-circle/).

Nichols et al. (2010a) devised the isolation chip (iChip), a plastic plate containing multiple through-holes, which can be dipped in a microbial/agar suspension. The compartmentalised design provides a high-throughput culturing option as it allows for the isolation of single bacterial cells, and thus pure cultivation in a single step. Following inoculation, semi-permeable membranes are fixed to either side of the agar with external plates, and the iChip is incubated in the natural environment (Fig. 1.9B). An earlier method, the diffusion chamber of Kaeberlein et al. (2002) (Fig. 1.9C), was designed for incubation in aquatic environments and sediments. As in the iChip, microbes in the diffusion chamber are suspended in a layer of agar and sandwiched between semi-permeable membranes.

In terms of antibiotic development, novel culturing techniques have borne fruit, with the recent discovery of a novel antibiotic from a newly isolated Betaproteobacterial species, Eleftheria terrae, captured using the iChip (Ling et al. 2015). The antibiotic, teixobactin, belongs to a new class of cell wall biosynthesis inhibitor, with activity against Gram-positive pathogens including multi-drug resistant S. aureus. Importantly, its mode of action appears to defy the development of resistance over time (Ling et al. 2015). Unfortunately, while the iChip is high-throughput, it is still labour intensive, as indicated by the pathway to teixobactin discovery, which required the screening of fermentation extracts from ~10,000 iChip-cultivated bacteria (Nichols et al. 2010a, Ling et al. 2015). Interestingly, the iChip method did not recover Actinobacteria, such as Streptomyces spp., as successfully as Proteobacteria and Firmicutes (Nichols et al. 2010a). Nevertheless, in situ cultivation techniques such as the iChip have allowed for increased recovery of bacterial species, with up to 50% of inoculated bacteria forming colonies. With domestication onto standard agar plates enhanced through several rounds of cultivation, the iChip has expanded the success of antibiotic recovery from novel species (Nichols et al. 2010a).

1.6 Cold-adapted bacteria as a source of novel natural products

In the search for new bacterial species and novel NP compounds, the focus is increasingly turning to under-explored and extreme environments, including hydrothermal vents, caves, rainforests, deserts, oceans and polar regions (Fig. 1.10) (Dhakal et al. 2017, Lazzarini et al. 2000, Bérdy 2005). These environments are under-explored due to limited accessibility and pose unique survival challenges for resident microbiota. As a result, they are known to harbour unique microorganisms, whose metabolic pathways yield valuable new enzymes and metabolites (Ji et al. 2017, Terpe 2013, de Pascale et al. 2012). For example, biotechnologically vital polymerases, such as the thermostable, high-fidelity Pfu, have been isolated from hyperthermophiles such as Pyrococcus furiosus, which grows optimally at 100°C (Lundberg et al. 1991), while cold-adapted organisms are a source of industrially useful antifreeze proteins, such as Afp1, isolated from the psychrophilic yeast Glaciozyma antarctica (Hashim et al. 2013).

Figure 1.10 Under-explored and extreme environments are targets for novel natural products discovery. They include (A) caves, (B) polar regions, (C) hydrothermal vents, (D) oceans, (E) deserts and (F) rainforests. (Image sources: (A) J. Spies, http://yourshot.nationalgeographic.com | (B) C. Anthony, NSF, https://photolibrary.usap.gov | (C) NSF/NOAA, https://www.pmel.noaa.gov/ | (D) J. Reed, http://dorsrv1.fau.edu/ | (E) J-C. Latombe, http://ai.stanford.edu/ | (F) F. Fakhrurrazi, https://www.nationalgeographic.com).

Earth is predominantly a cold environment. Collectively, polar and alpine areas, deep oceans, subterranean caves, and the upper atmosphere represent an estimated 85% of the biosphere and maintain a temperature of 5°C or less. About 90% of Earth’s oceans and 26% of land regions are consistently ≤ 5°C (Margesin & Miteva 2011). A large proportion of Earth's microbial diversity is therefore cold-adapted, classified as either psychrophiles or psychrotrophs according to optimal and maximum growth temperatures when in culture (Morita 1975). The accepted definition of a psychrophile is a bacterium or archaeon with an optimal growth temperature of 15°C or less, and a maximum growth temperature of 20°C. Conversely, psychrotrophs are those which grow at low temperatures but whose optimal growth temperature is > 15°C, with a maximum of 30°C (Chintalapati 2004, Blanc et al. 2012, Morita 1975). Thus far, true psychrophiles have been cultured from environments which remain consistently at or below 4°C, such as oceans, sea ice, and sediments (Bowman et al. 2005). Permafrosts and polar soils typically harbour psychrotrophic rather than psychrophilic microbial isolates (Morita 1975, De Maayer et al. 2014), though questions remain regarding the appropriateness of sample collection, storage and culturing techniques for the successful capture of psychrophiles from these environments (Morita 1975, De Maayer et al. 2014, Soina et al. 2004).
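The growth-temperature definitions given above can be expressed as a small decision rule. The Python sketch below is a minimal illustration: the function name `classify_cold_adaptation` and its parameters are hypothetical, the thresholds are taken directly from the definitions in the text, and the "grows at low temperatures" criterion is passed in as a flag because no numeric cut-off is stated. Isolates that fit neither definition are returned as "unclassified" rather than forced into a category.

```python
def classify_cold_adaptation(optimum_c: float, maximum_c: float,
                             grows_at_low_temp: bool = True) -> str:
    """Classify an isolate from its optimal and maximum growth temperatures (°C).

    Thresholds follow the definitions in the text:
      psychrophile: optimum <= 15 °C and maximum <= 20 °C
      psychrotroph: grows at low temperatures, optimum > 15 °C, maximum <= 30 °C
    """
    if maximum_c < optimum_c:
        raise ValueError("maximum growth temperature cannot be below the optimum")
    if not grows_at_low_temp:
        return "not cold-adapted"
    if optimum_c <= 15 and maximum_c <= 20:
        return "psychrophile"
    if optimum_c > 15 and maximum_c <= 30:
        return "psychrotroph"
    # Combinations not covered by either definition, e.g. optimum 12 °C, maximum 25 °C.
    return "unclassified"


if __name__ == "__main__":
    # Hypothetical isolates, given as (optimum, maximum) in °C.
    for optimum, maximum in [(10, 18), (22, 28), (12, 25)]:
        print(optimum, maximum, classify_cold_adaptation(optimum, maximum))
```

Because the classification depends on temperatures measured in culture, it inherits the sampling and culturing caveats raised above for permafrost and polar soil isolates.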


Unicellular microorganisms lack the physiology to regulate their temperature; cold-adapted microorganisms have therefore evolved various structural and functional mechanisms to enable survival and metabolism at sub-zero temperatures (Casanueva et al. 2010). Cold-adapted bacteria possess modified proteins, amino acids, and cell wall and membrane components (Margesin & Miteva 2011, Nikrad et al. 2016). For example, as well as antifreeze proteins, microbes adapted to survive in extremely cold conditions produce unique lipids and biosurfactants (Janek et al. 2010, Gentile et al. 2003). Dormancy is also a survival strategy, achieved through formation of spores and cyst-like resting cells, which display capsularised, thickened cell walls resistant to a range of environmental stressors (Soina et al. 2004). These resting forms exhibit strongly reduced respiration and metabolism (Blagodatskaya & Kuzyakov 2013). Further, it has been suggested that energy sourced through the scavenging of atmospheric trace hydrogen plays a significant role in their persistence (Blagodatskaya & Kuzyakov 2013, Greening et al. 2015). Previous researchers have examined whether microorganisms in sub-zero environments are metabolically active. The lowest recorded temperature of activity thus far was reported by Panikov and colleagues, who measured metabolic respiration occurring in an Arctic soil community at a remarkable −39°C (Panikov et al. 2006).

Arguably the most significant adaptation in cold-adapted microorganisms is the modification of the cell membrane. Referred to as homeoviscous adaptation, the process involves maintenance of the membrane bilayer and viscosity via alteration of the FAs incorporated into the membrane (Ernst et al. 2016, Kralova 2017). Changes typically include a proportional increase in methyl-branched and unsaturated FAs, with increased cis-configuration. Incorporation of these types of FAs decreases the melting temperature of membrane phospholipids, maintaining membrane fluidity and nutrient transport at cold temperatures (Fig. 1.11) (Okuyama et al. 2007, Bianchi et al. 2014, Nichols et al. 1993, Kralova 2017). Some psychrophilic bacteria incorporate long-chain polyunsaturated fatty acids (LC-PUFA), such as docosahexaenoic acid (DHA) (Fig. 1.11).

Figure 1.11 Cold-adapted bacteria lower the melting temperature of their membrane phospholipids, through modification of fatty acid (FA) components. Changes include a decrease in saturated FAs such as hexadecanoic acid, and increases in methyl-branched and unsaturated FAs, such as anteiso-heptadecanoic acid and cis-9-hexadecenoic acid. A small group of microorganisms synthesise and insert long-chain polyunsaturated FAs (PUFA), such as docosahexaenoic acid (DHA), into their membrane (Kralova 2017, Chintalapati 2004).

429 The synthesis of unusual FAs in cold-adapted bacteria bears relevance to antibiotic

430 bioprospecting in polar regions. The evolutionary relatedness of PKS and FAS, and their

431 sharing of precursors, suggests that cold-adapted microorganisms harbouring atypical FAs

432 may likewise have evolved unique PKs.


433 1.7 Polar terrestrial environments and their microbial diversity

434

Figure 1.12 Map of Antarctica. Only 0.36% of the Antarctic continent is ice-free, and

coastal ice-free areas support the majority of occupied research stations (a selection of

USA, UK and Australian stations are shown in red). East and West Antarctica are divided

by the Transantarctic mountains. Adapted from USGS (2008).

435


436 Antarctica is one of Earth’s harshest environments. Of its 13,661,000 km², only 0.36% of the

437 continent is ice-free (Fig. 1.12) (Babalola et al. 2009, Ji et al. 2017). The Antarctic continent

438 is described as the “highest, driest, windiest and coldest” place on Earth (Yergeau et al. 2012),

439 with temperatures ranging from a record minimum of -89°C, to a summer maximum of

440 +10°C, and a yearly average of -10°C in coastal areas to -60°C inland (Scambos et al. 2018,

441 Cary et al. 2010). In comparison to Antarctica, the regional Arctic climate is less extreme,

442 and as such supports a greater diversity of animal and vascular plant life (Fig. 1.13) (Williams

443 et al. 2017). For example, in the high Arctic archipelago of Svalbard, Norway, a minimum

444 record temperature of -46°C, and maximum of +21°C have been experienced, with a yearly

445 average of around −5°C (Piskozub 2017, NMI, 2017).

446

447 Polar soils comprise an upper 'active' layer, covering a deeper layer of permafrost; ground

448 which retains a temperature ≤ 0°C for a minimum of two consecutive years (Jansson &

449 Taş 2014, Stewart et al. 2012, Makhalanyane et al. 2015b). In both the Arctic and

450 Antarctica, active layer soils are subjected to alternating long seasonal periods of darkness

451 and high UV radiation, and regular freeze-thaw cycles (Obbels et al. 2016, Cary et al. 2010,

452 Stomeo et al. 2012).

453

454 Molecular surveys have revealed a much greater microbial richness in Arctic desert soils

455 when compared with Antarctica. For example, Ferrari et al. (2015) found up to 6-fold higher

456 microbial richness in high Arctic desert regions compared with eastern Antarctica. In eastern

457 Antarctic soils, bacteria dominate, with eukaryotic and archaeal richness an estimated 17-

458 fold and 40-fold lower respectively (Zhang et al. 2019, Ji et al. 2017, Ferrari et al. 2015).

459 Furthermore, fungal groups make up only 10% of eukaryotic diversity (Zhang et al. 2019),


460 compared with ~48% in Arctic tundra (Shi et al. 2015). These differences have been

461 attributed to increased Arctic soil fertility, and co-presence of vegetation, insect and animal

462 life (Ferrari et al. 2015, Siciliano et al. 2014). Regardless, despite being markedly low in

463 moisture and nutrients, Antarctic soils harbour surprisingly diverse bacterial communities,

464 spanning over 60 phyla (Cary et al. 2010, Ferrari et al. 2015, Ganzert et al. 2011, Zhang et

465 al. 2019).

466

Figure 1.13 The Arctic climate is moderately less extreme than the Antarctic and

supports more animal and vascular plant life. (A) The barren landscape of Adams Flat,

in the Vestfold Hills region of eastern Antarctica. (B) Vestpynten, Svalbard in the high

Arctic, showing the presence of vegetation. Photographs courtesy of AAD.

467

468 Permafrosts and polar desert soils have consistently revealed a high proportion of

469 Actinobacteria and Proteobacteria, while other dominant phyla include Acidobacteria,

470 Bacteroidetes, Chloroflexi, Gemmatimonadetes, Deinococcus-Thermus and Cyanobacteria

471 (Ferrari et al. 2015, Ji et al. 2016, Cary et al. 2010, Jansson & Taş 2014, Yergeau et al. 2007,

472 Aislabie et al. 2008). For example, in the McMurdo Dry Valleys, Proteobacteria dominated

473 molecular studies with 23% abundance, followed by Actinobacteria (20%) (Fig. 1.14) (Cary

474 et al. 2010).

475

Figure 1.14 Bacterial phylogenetic diversity of McMurdo Dry Valleys, eastern

Antarctica. Screening of 16S rDNA gene bacterial diversity shows that terrestrial

Antarctica typically comprises a high abundance of Actinobacteria and Proteobacteria,

with similar phylogenetic profiles to those seen here. Adapted from Cary et al. (2010).

476

477 While Actinobacteria and Proteobacteria are environmentally ubiquitous, they appear well-

478 adapted to life at Earth’s poles. The high relative abundance of these prolific NP-producers

479 in polar regions increases their attractiveness as targets for bioprospecting (Núñez-Pons &

480 Avila 2015). Furthermore, phylogenetic studies have confirmed that the majority of

481 Actinobacteria uncovered by Antarctic molecular studies remain to be cultured (Babalola et

482 al. 2009).

483

484 Previously, microbial research efforts in Antarctica have focused mainly on ice-free coastal

485 areas in proximity to occupied research stations, such as Victoria Land near McMurdo Station,

486 and the maritime region of the Antarctic Peninsula to the west of the continent (Fig. 1.12)

487 (Pulschen et al. 2017, Chong et al. 2012). Consequently, aquatic samples and marine

488 sediments are the focus of most studies, while desert soil microbial studies remain rare,

489 particularly for eastern Antarctica (Wilkins et al. 2013, Zhu et al. 2015, Nichols et al. 1999).

490 Bacteria cultured from terrestrial Antarctica thus far show a dominance of Actinobacteria,

491 Proteobacteria, Firmicutes and Bacteroidetes phyla (Smith et al. 2006, Cary et al. 2010,

492 Zdanowski et al. 2013, Pudasaini et al. 2017, Chong et al. 2015).

493

494 In terms of NP discovery, gene screening and bioactivity surveys on Antarctic isolates have

495 been modest in number and scale. However, bacteria isolated from sediments, soils, penguin

496 rookeries, permafrosts and glacial waters have been screened for PKS and NRPS gene

497 amplicons and/or antimicrobial activity (Shekh et al. 2011, Zhao et al. 2011, Zhao et al. 2008,

498 Encheva-Malinova et al. 2014, Yi Pan et al. 2013, Gesheva 2010, Silva et al. 2018, Lee et al.

499 2012). For example, Zhao et al. (2011, 2008) used molecular techniques to analyse Antarctic

500 coastal sediments for Type I PKS and NRPS genes. Analysis showed genes with closest

501 homology primarily to members of Cyanobacteria, Firmicutes and Proteobacteria. The

502 sequences exhibited low sequence similarity (~50-80%) to known gene sequences (Zhao et

503 al. 2008, Zhao et al. 2011). In antimicrobial assays, cultured Antarctic bacteria have

504 displayed activity predominantly against Gram-positive genera (e.g. Staphylococcus and

505 Bacillus) and fungi (e.g. Candida), including some multi-resistant strains (Gesheva 2010,

506 Shekh et al. 2011, Lee et al. 2012).

507

508 1.8 Molecular technologies for natural products discovery

509 It remains an open question whether harsh polar landscapes are worthy targets for NP

510 biomining compared with mesophilic soils. Perhaps microorganisms in polar deserts, at much

511 lower numbers than their temperate soil counterparts, have not been required to evolve the

512 same chemical competitive advantages, but instead gain greater fitness through physiological

513 changes. Conversely, with so few resources available to share, competitive advantages

514 provided through secondary metabolism may be a key to survival.

515

516 Molecular technologies such as high-throughput sequencing (HTS) offer a means to answer

517 this question, via whole genome and metagenome mining for BGCs, as well as

518 amplicon screening of NP genes from environmental DNA. HTS technologies, while not

519 without limitations, have vastly improved our understanding of microbial ecology, and

520 provide a more accurate estimation of diversity by taking into account the uncultured

521 majority (Caporaso et al. 2011, Hugenholtz et al. 2016, van Dijk et al. 2018). For example,

522 in several recent studies, Charlop-Powers et al. (2014, 2015, 2016), used HTS platforms

523 Roche 454 and Illumina to assess PKS and NRPS amplicons from geographically and

524 chemically diverse soils throughout the USA, Asia, Africa, Hawaii, Australia and the

525 Dominican Republic. Soil types included temperate and alpine forests, rainforests, hot

526 deserts, coastal sediments and urban parkland. Their results suggest arid soils present the

527 greatest biosynthetic potential, and that a population bias toward NP-rich phyla such as

528 Actinobacteria in these soil types contributes to greater PKS and NRPS richness (Charlop-

529 Powers et al. 2014, Charlop-Powers et al. 2015). More recently, molecular techniques were

530 used to survey PKS and NRPS genes in soil bacterial communities from diverse locations

531 including the Antarctic Peninsula. The authors found that Antarctic soils harboured endemic

532 NP sequences with low similarity to known compound sequences (Borsetto et al. 2019).

533

534 HTS platforms can be broadly characterised into short- or long-read technologies, and the

535 selection of a specific platform involves inevitable trade-offs between cost, coverage,

536 accuracy, and resolution (Goodwin et al. 2016). Short-read, second generation sequencing

537 (SGS) technologies (~50-300 bp), are exemplified by Illumina, who dominate the field by

538 providing instruments of the highest throughput and accuracy at the lowest cost (e.g. MiSeq,

539 HiSeq) (Goodwin et al. 2016, Levy & Myers 2016, Sedlazeck et al. 2018). For Illumina

540 instruments, limitations include an under-representation of A+T-rich and G+C-rich regions,

541 and difficulty in resolving long repetitive regions and structural variations, which become

542 particularly apparent in de novo genome and metagenome assembly (Goodwin et al. 2016,

543 van Dijk et al. 2018, Chen et al. 2013). Repetitive elements can comprise up to 10% of a

544 bacterial genome, spanning lengths far greater than that achievable by short-read

545 technologies. This inevitably leads to fragmentation, misassemblies and genome gaps which

546 are difficult or impossible to resolve (Goodwin et al. 2016, Levy & Myers 2016, Miller et al.

547 2017). Importantly, essential and functional genes have been missed, such as ribosomal RNA

548 operons and transposons, discovered through direct comparison with long-read assemblies

549 (Driscoll et al. 2017, Hoefler et al. 2013).

550


551 These challenges are similarly relevant to NP discovery efforts. BGCs, including those of

552 PKS and NRPS, regularly span long (~20 kb) contiguous genomic regions, are rich in G+C

553 content, and, due to the highly-conserved and modular nature of the clusters, are repetitious

554 (Miller et al. 2017, Laureti et al. 2011, Gomez-Escribano et al. 2016, Nakano et al. 2017).

555 Fragmentation can be deleterious to the accurate annotation of these biotechnologically

556 important gene clusters (Goldstein et al. 2019, Hoefler et al. 2013). BGCs are examined by

557 NP chemists to make predictions about substrate selection, chemical structure,

558 stereochemistry, mechanisms of action and binding targets (Miller et al. 2017, Donadio et al.

559 2007). For cryptic pathways, where heterologous expression may present the most attractive

560 approach to large-scale compound production, accurate resolution of the complete BGC

561 sequences is fundamental (Miller et al. 2017).

562

563 Third-generation sequencing (TGS) long-read platforms, such as Pacific Biosciences

564 (PacBio) single-molecule real-time (SMRT) (e.g. RS II, Sequel) and Oxford Nanopore

565 Technologies (ONT) (e.g. MinION) instruments are capable of resolving regions unable to

566 be determined by short-read sequencing by producing 8 kb to > 1 Mb reads, at the expense

567 of lower throughput, higher error rate and cost (Goodwin et al. 2016, Levy & Myers 2016,

568 Sedlazeck et al. 2018, Payne et al. 2018). For long-read technologies, the high error rate,

569 occurring most commonly as random indels in raw data at a frequency as high as 15% for

570 SMRT, and 30% for nanopore, is the primary weakness (van Dijk et al. 2018, Goodwin et al.

571 2016). Fortunately, the stochastic nature of the errors enables correction, either through repeated

572 sequencing of single molecules, a feature of SMRT but not currently of nanopore

573 (van Dijk et al. 2018), or through repeated coverage of the same genomic region, which can then

574 undergo consensus polishing. For SMRT sequencing, a high consensus accuracy of

575 ~99.999% can be achieved with ~30× coverage (Goodwin et al. 2016, Levy & Myers 2016,

576 Nakano et al. 2017).
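The reasoning that random errors can be averaged out by repeated coverage can be illustrated with a deliberately simplified sketch. It assumes that each read is independently wrong at a given position with a fixed probability, and that the consensus is wrong only when more than half of the reads are wrong. Both are assumptions made for illustration: real long-read errors are mostly indels, and real polishing tools use more elaborate models, so the function name `consensus_error` and the values below are hypothetical and are not intended to reproduce the accuracy figures cited above.

```python
from math import comb


def consensus_error(per_read_error: float, coverage: int) -> float:
    """Probability that a strict majority of reads is wrong at one position.

    Toy binomial model (an assumption, not a published method): errors are
    independent between reads and a majority vote decides the consensus base.
    """
    first_losing_count = coverage // 2 + 1  # smallest number of wrong reads that wins the vote
    return sum(
        comb(coverage, k)
        * per_read_error**k
        * (1 - per_read_error) ** (coverage - k)
        for k in range(first_losing_count, coverage + 1)
    )


# Hypothetical per-read error of 15% at increasing coverage depths.
for depth in (1, 5, 15, 30):
    print(depth, consensus_error(0.15, depth))
```

Under these assumptions the consensus error falls steeply as coverage increases, which is the qualitative behaviour the paragraph describes.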

577

578 1.9 Thesis scope and aims

579 Terrestrial Antarctica is one of the most extreme habitats on Earth and remains under-

580 explored in terms of microbial and chemical diversity. Antarctic soils are dominated by

581 Actinobacteria and Proteobacteria, phyla proven to be rich sources of bioactive metabolites.

582 Cold adaptation in microorganisms is facilitated by modifications with relevance to secondary

583 metabolism, such as the biosynthesis of unique FAs, which are evolutionarily related to

584 the PK class of secondary metabolites. A handful of small-scale studies have demonstrated

585 antimicrobial potential in polar bacterial isolates, but little is known regarding the

586 biosynthetic potential of east Antarctic desert soil communities. Recently, an analysis of

587 antimicrobial-associated genes across a range of soil types has suggested that arid soils offer

588 the greatest biosynthetic potential. We therefore hypothesised that the extremely limiting

589 environmental conditions of polar deserts may provide a rich source of unique NP genes and

590 compounds, with bioactivities that may contribute to the success and survival of the dominant

591 phyla in these regions, the Actinobacteria and Proteobacteria.

592

593 We aimed to determine the novel NP capacity of cold-adapted bacteria from under-

594 investigated regions of eastern Antarctica and the high Arctic, using both culture-independent

595 methods harnessing the latest in HTS technology, as well as novel culture-dependent

596 approaches. The primary objectives of this research were: (1) to identify polar soil

597 communities with novel biosynthetic potential by conducting a first-of-its-kind, large scale

598 investigation into NP-encoding genes in polar desert soils, targeting bacterial PKS and

599 NRPS-encoding genes, (2) to establish a culture collection of Antarctic isolates with

600 demonstrated bioactive capabilities, cultured using novel approaches targeting the

601 Actinobacteria and Proteobacteria phyla from soils exhibiting diverse and novel NP genes,

602 (3) to perform whole genome sequencing (WGS) on a number of the most promising isolates

603 with antimicrobial activity, NP genes, or other biotechnological value such as pigmentation,

604 and to conduct a deep investigation to uncover their novel natural product potential through

605 biosynthetic gene cluster (BGC) mining.


CHAPTER TWO

2 HARNESSING LONG-READ AMPLICON SEQUENCING

TO UNCOVER NRPS AND TYPE I PKS GENE

SEQUENCE DIVERSITY IN POLAR DESERT SOILS

This Chapter has been published as:

Benaud, N., Zhang, E., van Dorst, J., Brown, M.V., Kalaitzis, J.A., Neilan, B.A., Ferrari,

B.C. (2019). Harnessing Long-Read Amplicon Sequencing to Uncover NRPS and Type I

PKS Gene Sequence Diversity in Polar Desert Soils. FEMS Microbiology Ecology.

doi.org/10.1093/femsec/fiz031.

1 2.1 INTRODUCTION

2 2.1.1 The polar deserts of East Antarctica and the High Arctic

3 The majority of ice-free regions in Antarctica (Fig. 2.1A) and the high Arctic (Fig. 2.2A) are

4 classified as polar deserts, collectively spanning approximately 5 million km² of terrestrial

5 Earth (Barry & Hall-McKim 2018, Goryachkin et al. 1999). Annual precipitation in these

6 systems compares to that of dry deserts, such as the Sahara and Gobi (Campbell & Claridge

7 1987b). The bioavailability of water is further restricted as precipitation falls mainly as snow

8 (Genthon et al. 2018, Campbell & Claridge 1987b, Lesins et al. 2010). In addition, very low

9 seasonal atmospheric humidity contributes to remarkably low surface soil water content, with

10 water availability reported as one of the most important variables influencing the activity and

11 distribution of polar desert biota, along with low soil nutrients (Stomeo et al. 2012, Obbels

12 et al. 2016, Campbell & Claridge 1987a). Polar deserts are largely devoid of vegetation,

13 which is driven by sub-zero temperatures and high aridity (Fig. 1.13). In turn, carbon, nitrogen

14 and phosphorus are strongly limited, leading to extremely reduced soil biodiversity,

15 particularly in Antarctica (Siciliano et al. 2014, Cary et al. 2010, Campbell & Claridge 1987a,

16 Maestre et al. 2015).

17

18 Eastern Antarctica is home to several permanently occupied Australian research facilities

19 including Casey and Davis stations (Fig. 2.1). Casey station is situated in the Windmill

20 Islands region of Wilkes Land (Fig. 2.1B), an ice-free oasis of low lying (< 110 m) islands

21 and five major peninsulas (Clark, Bailey, Mitchell, Robinson Ridge and Browning

22 Peninsulas) (Goodwin 1993, Melick et al. 1994). Davis station is situated 1400 km from

23 Casey, in the Vestfold Hills, Princess Elizabeth Land (Fig. 2.1C). The low lying (< 160 m)

24 Vestfold Hills region features three main peninsulas (Mule, Broad and Long Peninsulas), and

25 numerous sea-inlets and lakes (Kiernan & McConnell 2001, Verleyen et al. 2011, Seppelt et

26 al. 1988). Regional weather conditions in the Vestfold Hills are slightly milder on average

27 than those of the Windmill Islands (Table 2.1), although both experience low annual

28 precipitation (< 200 mm) (Campbell & Claridge 1987b, Seppelt et al. 1988, Melick et al.

29 1994).


Figure 2.1 Maps of eastern Antarctica highlighting Windmill Islands and Vestfold

Hills regions. (A) Antarctic continent showing location of Vestfold Hills and Windmill

Islands regions. (B) Windmill Islands region showing the location of Casey Station (CS),

Mitchell Peninsula (MP), Robinson Ridge (RR), Herring Island (HI) and Browning

Peninsula (BP) (C) The Vestfold Hills, depicting Davis station, Adam’s Flat (AF),

Heidemann Valley (HV), Old Wallow (OW) and Rookery Lake (RL). Maps adapted from

AADC (2017), photographs of MP and AF sampling sites courtesy of AAD and Tom

Mooney.

30


Figure 2.2 Map of the high Arctic, focussing on Ellesmere Island and Svalbard. Polar deserts are typically located > 75° N. (A) The Arctic circle, highlighting Canada,

Greenland and Svalbard. (B) Alexandra Fjord Highlands (AFH), Ellesmere Island,

Canada. (C) Spitsbergen, Svalbard, Norway, showing locations of Skjæringa (SS) and

Vestpynten (SV). Maps adapted from England et al. (2000); NPI (2016) and UT Libraries

(2009). Photograph of Alexandra Fjord by Katriina O’Kane http://arcticjournal.ca/featured/alexandra-fiord-a-high-arctic-oasis/, photograph of SV sampling sites courtesy of AAD.


31 Table 2.1 Mean annual weather statistics for regions within eastern Antarctica and the

32 high Arctic.

                                 EAST ANTARCTICA                 HIGH ARCTIC
Mean annual weather statistics   Windmill Is.   Vestfold Hills   Ellesmere Is.   Svalbard
Precipitation (mm)               < 175          < 200            < 200           < 270
Temperature range (°C)           -41 to +9      -40 to +13       -41 to +9       -24 to +21
Wind speed (km/h)                > 54           20               11              20

(Sources: Melick et al. 1994, Campbell & Claridge 1987b, Lévesque 1997, Lesins et al. 2010, Stewart et al. 2011, Rayback 2006, Isaksen et al. 2016, Hansen et al. 2014, NMI 2017, Seppelt et al. 1988).

33

34 In the high Arctic (Fig. 2.2), desert soils are typically found above 75° N, in parts of Canada,

35 Norway, Alaska, Greenland and Russia (Goryachkin et al. 1999, Tedrow 2004, Barry & Hall-

36 McKim 2018). Two such polar deserts include Alexandra Fjord highlands, situated on Johan

37 Peninsula, Ellesmere Island, Nunavut, Canada (Fig. 2.2B), and Longyearbyen, on the

38 Norwegian archipelago of Svalbard (Fig. 2.2C), the latter of which boasts the world’s

39 northernmost human-populated township (Isaksen et al. 2016, Hansen et al. 2014, Stewart et

40 al. 2011). Climate change is impacting the Arctic at a greater rate than the world average,

41 with Svalbard experiencing an increase in annual mean winter temperature of 4.6°C since the

42 1990s. Summer temperatures in Longyearbyen can reach upwards of +21°C (Table 2.1)

43 (Isaksen et al. 2016, Hansen et al. 2014).

44

45 2.1.2 Surveying polar desert soils for natural product genes

46 In this chapter, we aimed to elucidate the diversity of NP-encoding genes present in an

47 extensive collection of desert soils from Antarctica and the high Arctic, for which the NP

48 diversity is unknown. We hypothesised that the unique environmental challenges faced by


49 the microbiota in these regions could select for novel NP genes and compounds, and may

50 contribute to the success and survival of the dominant phyla, the Actinobacteria and

51 Proteobacteria. Thus, we were particularly interested in the level of sequence novelty

52 compared with known NP genes, and the bioactivities of predicted compounds encoded by

53 NP genes harboured by resident bacteria. Further, to assist future bioprospecting efforts, we

54 aimed to identify specific polar soils with the greatest novel NP discovery potential.

55

56 We chose to employ the power of the third generation, long-read SMRT sequencing platform

57 PacBio RS II for this analysis. Short-read (200-400 bp) amplicon sequencing technologies

58 have been employed to profile biosynthetic genes in temperate, hot desert and high altitude

59 soil biomes, as well as sponge microbiomes (Charlop-Powers et al. 2014, Charlop-Powers et

60 al. 2016, Aleti et al. 2017, Woodhouse et al. 2013, Borchert et al. 2016). We proposed that

61 longer read lengths would enable capture of entire gene amplicons produced by commonly

62 employed degenerate primer sets, which target long regions of conserved NRPS AD-domain

63 (~700 bp) (Fig. 1.6), and PKS KS/AT-domain sequences (~1200-1400 bp) (Fig. 1.3) (Ayuso-

64 Sacido & Genilloud 2005, Owen et al. 2013, Peng et al. 2018). Thus, taxonomic resolution

65 would be enhanced, particularly at the protein level, to assist in functional predictions.

66

67 2.2 MATERIALS AND METHODS

68 2.2.1 Polar locations and soil collection

69 Soils were sampled from twelve polar locations, encompassing nine eastern Antarctic and

70 three high Arctic sites (Figs. 2.1 & 2.2). Five Antarctic sites were from the Windmill Islands

71 region; Mitchell Peninsula (MP) (66°19’S, 110°32’E), Robinson Ridge (RR) (66°22’S,

72 110°35’E), Browning Peninsula (BP) (66°28’S, 110°33’E), Herring Island (HI) (66°25’S,

73 110°39’E) and Casey station (CS) (66°17’S 110°32’E) (Fig. 2.1B), and four from the

74 Vestfold Hills, near Davis station; Adams Flat (AF) (68°33'S, 78°1'E), Heidemann Valley

75 (HV) (68°35'S, 78°0'E), Rookery Lake (RL) (68°30'S, 78°7'E) and Old Wallow (OW)

76 (68°36'S, 77°57'E) (Fig. 2.1C). High Arctic sites comprised Alexandra Fjord Highland

77 (Canada) (AFH) (78°52’N, 75°54’W) (Fig. 2.2B), and two from Svalbard (Norway);

78 Spitsbergen Longyearbyen Skjæringa (SS) (78°14’N, 15°30’E), and Spitsbergen

79 Longyearbyen Vestpynten (SV) (78°14’N, 15°20’E) (Fig. 2.2C).

80

Figure 2.3 Geospatial transect sampling design (not to scale). Soils were sampled along

three 300m parallel transects, and analysed at distance points 0, 2, 100, 102, 200 and 202

m from all 12 polar locations, excepting CS (0, 2, 100, 102, 105, 110 m). Photograph of

BP courtesy of AAD, figure adapted from Zhang (2016).

81

82 Soils were sampled by the Australian Antarctic Division (AAD) during the summer months

83 2005 and 2012, using a spatially explicit design (Fig. 2.3) (Siciliano et al. 2014, van Dorst et

84 al. 2014, Ferrari et al. 2015). At each site, samples were taken from the top 10 cm of soil

85 along three parallel transects, situated 2 m apart and 300 m in length (Fig. 2.3). Here, 18


86 samples were selected per site, collected at distance points 0, 2, 100, 102, 200 and 202 m

87 along each of the three transects, except for CS which were taken at 0, 2, 100, 102, 105, 110

88 m distances, totalling 216 samples.
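The sample total can be checked by enumerating the design. The following is a minimal sketch that assumes the site codes and distance points given above; the function name `sampling_points` and the tuple layout are illustrative choices, not part of the original workflow.

```python
# Distance points (m) along each transect, as stated in the text.
STANDARD_POINTS_M = (0, 2, 100, 102, 200, 202)
CASEY_POINTS_M = (0, 2, 100, 102, 105, 110)  # Casey station (CS) exception

# Nine eastern Antarctic and three high Arctic site codes.
SITES = ("MP", "RR", "BP", "HI", "CS", "AF", "HV", "RL", "OW", "AFH", "SS", "SV")
N_TRANSECTS = 3


def sampling_points():
    """Return one (site, transect, distance_m) tuple per selected soil sample."""
    samples = []
    for site in SITES:
        points = CASEY_POINTS_M if site == "CS" else STANDARD_POINTS_M
        for transect in range(1, N_TRANSECTS + 1):
            for distance in points:
                samples.append((site, transect, distance))
    return samples


samples = sampling_points()
print(len(samples))  # 12 sites x 3 transects x 6 points
```

Each site contributes 18 samples (3 transects × 6 points), which gives the 216 samples reported for the 12 locations.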

89

90 2.2.2 DNA extraction and 16S rDNA gene sequencing

91 DNA extraction and 16S rDNA sequencing and analysis were performed previously, as part

92 of larger biodiversity studies, described in detail in van Dorst et al. (2014), Siciliano et al.

93 (2014), Bissett et al. (2016) and Ferrari et al. (2015). Briefly, DNA was extracted from 300

94 mg of each soil sample, in triplicate, using the FastDNA SPIN kit for soil (MP Biomedicals,

95 NSW, Australia) (Siciliano et al. 2014, van Dorst et al. 2014, Ferrari et al. 2015). DNA was

96 quantified with the Quant-iT Picogreen dsDNA Assay kit (Life Technologies, VIC,

97 Australia) and stored at −80°C until further use. For the Windmill Islands and high Arctic sites,

98 bacterial 16S rDNA gene fragments were amplified using the primer set 27F and 519R and

99 sequenced using the 454 FLX titanium platform (Siciliano et al. 2014, van Dorst et al. 2014).

100 For the Vestfold Hills sites, sequencing was performed on the Illumina MiSeq platform

101 (Bissett et al. 2016). Operational taxonomic units (OTUs) were clustered using ≥ 97 %

102 similarity and taxonomy assigned using the Green Genes database (Bissett et al. 2016, Ferrari

103 et al. 2015).
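To illustrate what clustering at ≥ 97% similarity means in practice, the sketch below applies a greedy centroid approach to sequences that are assumed to be pre-aligned and of equal length. This is a teaching example only: the clustering software used in the cited studies is not named in this section, and the functions `identity` and `greedy_cluster` are hypothetical.

```python
def identity(seq_a: str, seq_b: str) -> float:
    """Fraction of matching positions between two equal-length, pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches / len(seq_a)


def greedy_cluster(sequences, threshold=0.97):
    """Assign each sequence to the first centroid it matches at >= threshold.

    A sequence that matches no existing centroid founds a new cluster (OTU).
    """
    centroids = []
    clusters = []
    for seq in sequences:
        for index, centroid in enumerate(centroids):
            if identity(seq, centroid) >= threshold:
                clusters[index].append(seq)
                break
        else:
            centroids.append(seq)
            clusters.append([seq])
    return clusters


# Hypothetical 100-nt sequences: 98% and 90% identical to the first one.
reference = "A" * 100
near_variant = "A" * 98 + "CC"
distant_variant = "A" * 90 + "C" * 10
print(len(greedy_cluster([reference, near_variant, distant_variant])))
```

In this toy input the 98% variant joins the reference OTU, while the 90% variant founds its own.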

104

105 Here, analyses of 16S bacterial relative abundance at phylum level, and of Actinobacteria at

106 order and family levels, were performed and visualised as stacked bar charts in R 3.4.0 using

107 the ggplot2 package v2.2-1 (Wickham 2011).
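Relative abundance here is simply each taxon's share of the total reads in a sample. The sketch below shows the calculation in Python rather than R, with invented read counts; the function name `relative_abundance` and the example values are assumptions for illustration and do not come from the thesis data.

```python
def relative_abundance(counts):
    """Convert per-taxon read counts for one sample into fractions summing to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no reads")
    return {taxon: n / total for taxon, n in counts.items()}


# Hypothetical phylum-level read counts for a single soil sample.
example_counts = {
    "Actinobacteria": 400,
    "Proteobacteria": 300,
    "Chloroflexi": 200,
    "Other": 100,
}

for taxon, fraction in relative_abundance(example_counts).items():
    print(f"{taxon}: {fraction:.1%}")
```

The resulting fractions per sample are the values that a stacked bar chart would display.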

108


109 2.2.3 Soil physical and chemical properties

110 Fifty physical and chemical parameters were collected by standard methods, and are

111 described in detail in Siciliano et al. (2014) and Bissett et al. (2016). Properties included

112 slope, aspect and elevation, pH, conductivity, dry matter fraction (DMF), soil particle size,

113 total phosphorus (TP), total carbon (TC), total nitrogen (TN); water extractable ions NO₂⁻,

114 Br⁻, NO₃⁻, PO₄³⁻, SO₄²⁻, and NH₄⁺; and major elemental concentrations such as SiO2, TiO2,

115 Al2O3, Fe2O3, MnO, MgO, CaO, Na2O, K2O, P2O5, SO3 and Cl (Siciliano et al. 2014, Bissett

116 et al. 2016).

117

118 Soil physical and chemical data obtained for all sites were transformed and normalised for

119 further analysis. Skewed variables such as TN, TC, S, Na, Zn, Ca, Mg were log transformed,

120 while CaO, MgO, Fe2O3 and TP were square root transformed. Three missing TC values

121 were estimated by the EM algorithm (Clarke & Gorley 2015).
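The transformation step can be sketched as follows. Several details are assumptions, because the text does not state them: the natural logarithm is used without an offset, and "normalised" is taken to mean z-score standardisation (mean 0, standard deviation 1). The function names and the example values are hypothetical.

```python
import math
import statistics


def log_transform(values):
    """Natural-log transform for right-skewed variables (assumes all values > 0)."""
    return [math.log(v) for v in values]


def sqrt_transform(values):
    """Square-root transform for moderately skewed variables (assumes values >= 0)."""
    return [math.sqrt(v) for v in values]


def normalise(values):
    """Z-score standardisation; an assumed interpretation of 'normalised'."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical total carbon values spanning two orders of magnitude.
total_carbon = [0.1, 1.0, 10.0]
print(normalise(log_transform(total_carbon)))

# Hypothetical total phosphorus values.
total_phosphorus = [4.0, 9.0, 16.0]
print(normalise(sqrt_transform(total_phosphorus)))
```

After transformation the skewed example values are evenly spaced, so standardisation centres the middle observation at zero.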

122

123 2.2.4 PKS PCR amplification, gel extraction and barcoding

124 To attach adaptors and barcodes, and increase yield, three rounds of PCR were employed for

125 PKS tag sequencing. As our soils contained high relative abundances of Actinobacteria, we

126 selected the published primers K1F and M6R (Table 2.2), previously designed and reported

127 for Actinomycetales by Ayuso-Sacido and Genilloud (2005). First-round PCR employed the

128 primers K1F/M6R and was performed under a touchdown thermocycler program to optimise

129 primer specificity (Table 2.2). Optimised reaction mixtures comprised 10 µL 5X Q5 Buffer

130 (NEB, Massachusetts), 10 µL 5x Q5 High G+C enhancer (NEB), 1.5 mM MgCl2, 0.2 mM

131 each dNTP, 16.25 µL water, 0.5 µM each primer (K1F, M6R), 1 unit of Q5 Hotstart High

132 Fidelity DNA Polymerase (NEB), and 5 µL of 1:10 dilution of DNA template (~5–10 ng/µL).

133 PCR products were visualised on 2% (w/v) agarose gel, and target amplicons (~1200-1400

134 bp length) extracted using the Zymoclean™ Gel DNA Recovery kit (Zymo Research,

135 California). Target amplicons were quantified using the NanoDrop 1000 Spectrophotometer

136 (Thermo Scientific, NSW, Australia), then used as templates in second-round PCR, to attach

137 adaptors. Primers K1F/M6R were modified to include a 5’ block and SMRT universal primer

138 (UP) adaptors (UPF-K1F/ UPR-M6R) (Table 2.2). Optimised reactions contained 10 µL 5X

139 Q5 Buffer, 10 µL 5X Q5 High G+C enhancer, 1.5 mM MgCl2, 0.2 mM each dNTP, 18.875

140 µL water, 0.5 µM each primer (UPF-K1F, UPR-M6R), 1 unit of Q5 Hotstart High Fidelity

141 DNA Polymerase, and approximately 10 ng of product as template, under optimal

142 thermocycler conditions (Table 2.2). Gel extracted target amplicons were quantified using

143 Quant-iT Picogreen dsDNA Assay kit (Life Technologies, VIC, Australia).
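The touchdown program used in the first-round PKS PCR lowers the annealing temperature stepwise before settling at a fixed value. A minimal sketch of the annealing schedule is given below, assuming that the first touchdown cycle anneals at 60°C and each subsequent cycle is 1°C lower (Table 2.2 gives the range and step, but not the exact starting cycle). The function name and parameters are illustrative.

```python
def touchdown_annealing_temps(
    start_c=60, final_c=50, step_c=1, touchdown_cycles=10, fixed_cycles=20
):
    """Return the annealing temperature (°C) for every PCR cycle.

    Assumption: cycle 1 anneals at start_c, and each touchdown cycle drops by
    step_c; the remaining cycles all anneal at final_c.
    """
    touchdown = [start_c - step_c * i for i in range(touchdown_cycles)]
    fixed = [final_c] * fixed_cycles
    return touchdown + fixed


schedule = touchdown_annealing_temps()
print(schedule[:10])   # touchdown phase
print(len(schedule))   # total number of cycles
```

Starting above the final annealing temperature favours specific primer binding in the early cycles, which is the stated purpose of the touchdown program.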


Table 2.2 PCR primers and conditions for amplification of PKS ketosynthase/acyl transferase domains, and NRPS adenylation domains.

PKS, PCR round 1
Primers (5’-3’): K1F TSAAGTCSAACATCGGBCA; M6R CGCAGGTTSCSGTACCAGTA
Thermocycler conditions: Touchdown: 98°C 1 min, 10 cycles [98°C 30 s, 60°C - 50°C 30 s, decreasing by 1°C each cycle, 72°C 40 s], 20 cycles [98°C 30 s, 50°C 30 s, 72°C 40 s], 72°C 2 min
Primer ref: Ayuso-Sacido & Genilloud (2005)

PKS, PCR round 2
Primers (5’-3’): UPF-K1F /5AmMC6/-GCAGTCGAACATGTAGCTGACTCAGGTCAC-K1F; UPR-M6R /5AmMC6/-TGGATCACTTGTGCAAGCATCACATCGTAG-M6R
Thermocycler conditions: 98°C 1 min, 25 cycles [98°C 30 s, 65°C 30 s, 72°C 50 s], 72°C 2 min
Primer ref: Ayuso-Sacido & Genilloud (2005), PacBio (2015)

PKS, PCR round 3
Primers (5’-3’): B-UPF lbc#-UPF; B-UPR lbc#-UPR
Thermocycler conditions: 98°C 30 s, 20 cycles [98°C 10 s, 60°C 30 s, 72°C 50 s], 72°C 2 min
Primer ref: PacBio (2015)

NRPS, PCR round 1
Primers (5’-3’): UPF-A3F /5AmMC6/-GCAGTCGAACATGTAGCTGACTCAGGTCACGCSTACSYSATSTACACSTCSGG; UPR-A7R /5AmMC6/-TGGATCACTTGTGCAAGCATCACATCGTAGSASGTCVCCSGTSCGGTAS
Thermocycler conditions: 98°C 3 min, 35 cycles [98°C 20 s, 65°C 30 s, 72°C 30 s], 72°C 3 min
Primer ref: Ayuso-Sacido & Genilloud (2005), PacBio (2015)

NRPS, PCR round 2
Primers (5’-3’): B-UPF lbc#-UPF; B-UPR lbc#-UPR
Thermocycler conditions: 98°C 3 min, 25 cycles [98°C 20 s, 71.3°C 30 s, 72°C 30 s], 72°C 3 min
Primer ref: PacBio (2015)

144 A3F/ A7R primer sequences indicated in bold


145 Third-round barcoding PCR was performed using PacBio supplied 96-well plates containing

146 unique barcoded (B) SMRT universal primer (UP) sets (B-UP F/R), under optimal

147 thermocycler conditions (Table 2.2). PKS samples were run in duplicate. Reactions

148 comprised 5 µL 5X Q5 Buffer, 5 µL 5X Q5 High G+C enhancer, 1.5 mM MgCl2, 0.2 mM

149 each dNTP, 2 µM B-UP F/R Primers, 0.5 units of Q5 Hotstart High Fidelity DNA

150 Polymerase, 10 ng of Picogreen-quantified, gel-purified product as template, and water to 25

151 µL.

152

153 2.2.5 NRPS PCR amplification and barcoding

154 NRPS-encoding gene amplifications and NRPS data processing were performed by E. Zhang

155 (2016). Two rounds of PCR were performed prior to tag sequencing for NRPS domains

156 (Zhang 2016). We selected the primer set A3F and A7R (Table 2.2), again previously

157 reported for Actinomycetales by Ayuso-Sacido and Genilloud (2005), and shown to amplify

158 AD sequences from a range of bacterial phyla (Owen et al. 2013, Charlop-Powers et al.

159 2014). First-round PCR employed degenerate primers A3F/A7R with an additional 5’ block

160 and universal SMRT primer (UP) adaptor (UPF-A3F, UPR-A7R) (Table 2.2) (Ayuso-Sacido

161 & Genilloud 2005, Pacific Biosciences 2015). Optimised thermocycler conditions were used

162 (Table 2.2), with mixtures comprising 10 µL 5x Q5 Buffer, 10 µL 5x Q5 High G+C

163 Enhancer, 1.5 mM MgCl2, 0.2 mM each dNTP, 16.25 µL water, 0.5 µM each Primer (UPF-

164 A3F, UPR-A7R), 1 unit Q5 Hotstart Hi-Fidelity DNA Polymerase, and 5 µL of 1:10 dilution

165 DNA (~5-10 ng/µL). PCR products were visualised on 2% (w/v) agarose gel, and target

166 amplicons (~700 bp length) extracted and purified using the Zymoclean Gel DNA Recovery

167 kit. Prior to barcoding, NRPS PCR products were NanoDrop quantified and pooled in

51

168 equimolar amounts representing the start (0, 2 m), middle (100, 102 m) and end (200, 202

169 m) of each transect (Fig. 2.2) for all positive sites.
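The arithmetic behind equimolar pooling can be sketched as follows. This is an illustrative calculation only, not the procedure used in the laboratory: it assumes an average mass of 660 g/mol per base pair of double-stranded DNA and a uniform amplicon length of 700 bp, and the sample names, concentrations and target amount are hypothetical.

```python
def ng_per_ul_to_nM(conc_ng_per_ul, length_bp, da_per_bp=660.0):
    """Convert a dsDNA concentration from ng/uL to nmol/L (nM).

    Assumes an average mass of da_per_bp g/mol per base pair.
    """
    return conc_ng_per_ul * 1e6 / (da_per_bp * length_bp)


def equimolar_volumes(concs_ng_per_ul, length_bp, fmol_each):
    """Volume (uL) of each product needed to contribute fmol_each femtomoles."""
    volumes = {}
    for sample, conc in concs_ng_per_ul.items():
        nM = ng_per_ul_to_nM(conc, length_bp)  # 1 nM equals 1 fmol per uL
        volumes[sample] = fmol_each / nM
    return volumes


# Hypothetical NanoDrop readings (ng/uL) for three products from one transect.
readings = {"start_0m": 46.2, "middle_100m": 23.1, "end_200m": 30.0}
for sample, vol in equimolar_volumes(readings, length_bp=700, fmol_each=50.0).items():
    print(sample, round(vol, 2), "uL")
```

Because all amplicons are taken to be the same length, pooling by equal mass would give the same result here; the molar conversion matters when products of different lengths are combined.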

Gel-purified first-round PCR products were used as templates for the second-round barcoding PCR, as described for PKS, under thermocycler conditions outlined in Table 2.2. Second-round NRPS PCR reaction mixtures contained 5 µL 5X Q5 Buffer, 5 µL 5X Q5 High G+C Enhancer, 1.5 mM MgCl2, 0.2 mM each dNTP, 2 µM B-UP F/R primers, 0.25 units Q5 Hot Start High-Fidelity DNA Polymerase, ~2-8 ng of gel-purified PCR product, and water up to 25 µL.

2.2.6 Natural product amplicon library preparation for SMRT sequencing

PKS and NRPS barcode-tagged PCR products were gel-extracted, PicoGreen-quantified, and pooled into two libraries. Libraries were submitted to the Ramaciotti Centre for Genomics (UNSW Sydney, NSW, Australia) for SMRTbell library preparation and multiplexed SMRT sequencing on the PacBio RS II (P4/C2) platform, employing one SMRT cell per library (Pacific Biosciences).

2.2.7 Processing PacBio SMRT sequencing data

Demultiplexed SMRT sequencing output was assessed for read quality using FastQC (Andrews 2010). Processing was performed using the QIIME (v 1.9.1) UPARSE pipeline (Caporaso et al. 2010, Edgar 2013). Barcode labels were assigned, and individual reads concatenated. Sequences were quality processed, dereplicated, and chimeras removed, to generate a unique set of sequences, which were clustered at 95% similarity for generation of amplified sequence variant (ASV) tables.

2.2.8 Taxonomic classification of sequences using the BLAST database

PKS and NRPS ASVs were analysed using both the BLASTn and BLASTx algorithms (Altschul et al. 1990). ASVs returning identical BLAST results were manually combined to generate a new set of ASVs for each dataset (Appendix Tables A1.1, A1.2). A representative nucleotide sequence was selected based on the longest read length match. Relative abundances at genus level for all PKS- and NRPS-positive sites were calculated as a proportion of total reads and visualised as bubble plots using the ggplot2 package in R 3.4.0 (Wickham 2011) (Figs. 2.5, 2.6).
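The relative abundance calculation can be expressed as a short sketch. This is a Python illustration under stated assumptions rather than the R code used for the figures: the genus names and read counts are hypothetical, and each count is simply expressed as a percentage of the total reads for one site.

```python
def relative_abundance(counts):
    """Express per-taxon read counts as percentages of the total reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads to normalise")
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


# Hypothetical genus-level read counts for a single site.
site_counts = {"Streptomyces": 620, "Frankia": 250, "Nostoc": 130}
for genus, pct in relative_abundance(site_counts).items():
    print(genus, round(pct, 1), "%")
```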

2.2.9 Multivariate data analysis

Sites containing > 1 positive sample were included in multivariate analyses, which were performed using PRIMER v7 with the PERMANOVA+ add-on (Clarke & Gorley 2015). NP gene and bacterial 16S abundance datasets (van Dorst et al. 2014, Bissett et al. 2016) were square-root transformed and standardised by total to generate Bray-Curtis dissimilarity matrices. Non-metric multidimensional scaling (nMDS) plots were generated for NP genes and visualised in 3D space (Fig. 2.13). Principal coordinates ordination (PCO) plots were created for bacterial 16S data (Fig. 2.14A). Transformed, normalised soil physical and chemical parameters were used to create a Euclidean distance resemblance matrix for PCO analysis (Fig. 2.14B) (Clarke & Gorley 2015).
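For readers unfamiliar with the resemblance measure, a minimal Python sketch of the pre-treatment and Bray-Curtis calculation is given below. It is an illustration only and does not reproduce PRIMER v7 output: the order of operations follows the description above (square-root transformation, then standardisation by total), the function names are hypothetical, and the two abundance profiles are invented.

```python
import math


def sqrt_then_standardise(counts):
    """Square-root transform counts, then express each as a percentage of the sample total."""
    rooted = [math.sqrt(c) for c in counts]
    total = sum(rooted)
    return [100.0 * r / total for r in rooted]


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two abundance profiles (0 = identical, 1 = no overlap)."""
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(a + b for a, b in zip(x, y))
    return numerator / denominator


# Hypothetical ASV read counts for two samples (same ASV order in both).
sample_a = sqrt_then_standardise([120, 30, 0, 50])
sample_b = sqrt_then_standardise([10, 90, 40, 60])
print(round(bray_curtis(sample_a, sample_b), 3))
```

The square-root step down-weights the most abundant ASVs, so that the dissimilarity is not driven by a few dominant sequences.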

For rarefaction curve analysis, subsampling of ASVs was performed without replacement to the lowest number of sequence reads per site (4000 PKS and 800 NRPS reads), at a step size of five, using a loop script of the rarefy function in the vegan package v2.4-3 in R 3.3.0 (Work et al. 2010, Oksanen et al. 2017). Rarefaction curves including standard error were visualised using the ggplot2 package v2.2-1 in R 3.3.0 (Fig. 2.4) (Wickham 2009).
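The idea of a rarefaction curve can be illustrated with a simple random subsampling sketch. This is not the vegan implementation, which calculates expected richness analytically; here a single random subsample is drawn without replacement at each depth, and the ASV names, counts and seed are hypothetical.

```python
import random


def rarefaction_curve(asv_counts, max_depth, step=5, seed=0):
    """Observed ASV richness in random subsamples of increasing depth.

    Reads are drawn without replacement; one subsample is drawn per depth.
    """
    rng = random.Random(seed)
    # Expand the count table into one entry per read.
    reads = [asv for asv, n in asv_counts.items() for _ in range(n)]
    if max_depth > len(reads):
        raise ValueError("max_depth exceeds the number of reads available")
    curve = []
    for depth in range(step, max_depth + 1, step):
        subsample = rng.sample(reads, depth)
        curve.append((depth, len(set(subsample))))
    return curve


# Hypothetical ASV count table for one site.
counts = {"ASV1": 40, "ASV2": 25, "ASV3": 10, "ASV4": 3, "ASV5": 2}
for depth, richness in rarefaction_curve(counts, max_depth=80, step=20):
    print(depth, richness)
```

Averaging many such subsamples at each depth would approximate the expected richness and give the standard error plotted alongside the curves.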

Mantel tests were performed between the PKS/NRPS ASV and corresponding bacterial 16S Bray-Curtis resemblance matrices using the RELATE function in PRIMER v7 with 999 permutations (Clarke & Gorley 2015).
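The logic of a permutation-based Mantel test is sketched below in Python. It is a simplified illustration and not the RELATE routine itself: the Pearson correlation is used here for brevity (a rank-based statistic could be substituted), the function names are hypothetical, and the two small distance matrices are invented.

```python
import math
import random


def condensed(matrix, order):
    """Upper-triangle values of a square matrix after reordering rows and columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(d1, d2, permutations=999, seed=0):
    """Return (r, p): matrix correlation and one-sided permutation P-value."""
    identity = list(range(len(d1)))
    x = condensed(d1, identity)
    observed = pearson(x, condensed(d2, identity))
    rng = random.Random(seed)
    at_least_as_large = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute sample labels of the second matrix
        if pearson(x, condensed(d2, order)) >= observed - 1e-12:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (permutations + 1)


# Hypothetical dissimilarities among four samples for two datasets.
np_genes = [[0, 0.2, 0.7, 0.9], [0.2, 0, 0.6, 0.8], [0.7, 0.6, 0, 0.3], [0.9, 0.8, 0.3, 0]]
bacteria = [[0, 0.3, 0.6, 0.8], [0.3, 0, 0.5, 0.9], [0.6, 0.5, 0, 0.2], [0.8, 0.9, 0.2, 0]]
r, p = mantel(np_genes, bacteria)
print(round(r, 3), p)
```

With 999 permutations the smallest attainable P-value is 0.001, since the observed arrangement is counted among the permutations.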

2.2.10 Statistical analysis

To calculate the effect of soil fertility parameters on presence/absence of PKS and NRPS gene amplification, a generalised linear mixed model (GLMM) was selected to account for the binary nature of our data, using the ‘lme4’ package in R 3.3.0 (Bates et al. 2015). P-values were calculated for log TC, log TN and DMF effects using the bootstrap option (n = 1000), with a significance level of P < 0.05. The significance of each soil fertility parameter was tested in both separate and paired models (Table 2.3). The relationship between amplification of NP genes, log TC and DMF (%) for all samples was visualised as a combined bar chart and dot plot in R 3.3.0 using the ggplot2 package v2.2-1 (Fig. 2.11) (Wickham 2009).

Chao1 estimates for PKS and NRPS gene richness were calculated in R 3.3.0, using the estimateR function of the vegan package v2.4-3 (Oksanen et al. 2017). Statistical analyses of the relationships between estimated Chao1 richness and selected soil parameters (TC, TN and soil moisture (1-DMF)) for PKS and NRPS genes were carried out using the lm() function in R 3.5.1 and plotted with the ggplot2 package v3.0.0 (Wickham 2009). These were visualised as scatterplots with linear regression lines, adjusted R-squared values and significance levels at a P-value cutoff of 0.05 (Fig. 2.12).
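As a brief illustration of the richness estimator, the sketch below computes Chao1 from a vector of ASV read counts in Python. It assumes the bias-corrected form of the estimator, S_obs + f1(f1 - 1) / (2(f2 + 1)), where f1 and f2 are the numbers of singleton and doubleton ASVs; the variant applied by any given software should be checked against its documentation, and the counts shown are hypothetical.

```python
def chao1(counts):
    """Bias-corrected Chao1 richness estimate from per-ASV read counts."""
    observed = sum(1 for c in counts if c > 0)   # ASVs seen at least once
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Hypothetical ASV read counts for one site.
site = [35, 12, 7, 2, 2, 1, 1, 1]
print(chao1(site))
```

The estimate exceeds the observed richness only when singletons are present, which is why the treatment of singletons during sequence processing influences Chao1 values.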

2.2.11 Construction of phylogenetic trees

Maximum likelihood trees with 1,000 bootstrap replications were constructed using PHYML (Guindon et al. 2010), as part of the Phylogeny.fr pipeline (Dereeper et al. 2008). Representative protein sequences were retrieved from GenBank (Benson et al. 2011). Multiple sequence alignment was performed using MUSCLE (Edgar 2004) in full processing mode, passed through PHYML and visualised in iTOL (Figs. 2.7 & 2.8) (Letunic & Bork 2016). Predictions about the type of compounds produced by our ASVs were made by uploading phylogenetic tree representative sequences to the Natural Product Domain Seeker database (NaPDoS) (Ziemert et al. 2012).

2.3 RESULTS

2.3.1 PKS and NRPS gene sequences compared across polar soils

Of the 216 polar soils analysed, 59 produced PKS PCR amplicons. Four Antarctic sites (BP, HV, OW and CS) did not produce PCR amplicons under the optimised conditions used. PKS sequences were recovered from multiple soil samples for three Antarctic sites in the Windmill Islands region (MP, RR and HI), and from single soil samples for all three high Arctic sites (AFH, SS and SV). Sequences were not recovered from the four sites in the Vestfold Hills region (AF, RL, HV and OW). In total, 23,240 circular consensus sequence (CCS) reads were retrieved, with an average predicted sequencing accuracy of 97%, length of 1383 bp, and G+C content of 69%. Sequence processing resulted in 292 KS/AT domain ASVs, including singletons. Subsequent BLAST analysis, manual ASV combination and removal of singletons resulted in 82 KS/AT domain ASVs (Appendix Table A1.1).

For the NRPS genes, PCR amplicons were produced for 137 of the 216 samples examined. NRPS sequences were recovered for all nine Antarctic sites, although only one sample was positive for the human-impacted CS. The three high Arctic sites, AFH, SS and SV, did not yield any NRPS sequences. A total of 19,596 CCS reads were obtained, with a mean predicted sequencing accuracy of 97%, length of 805 bp, and G+C content of 69%. Sequence processing resulted in 1,669 NRPS AD domain ASVs, including singletons. Following BLAST analysis, manual ASV combination and the removal of singletons, 144 unique AD domain ASVs remained (Appendix Table A1.2).

2.3.2 PKS and NRPS biosynthetic diversity in polar soils

PKS diversity ranged from 2 to 35 ASVs per site, being particularly low in the single-sample high Arctic sites (2-7 ASVs). These sites were consequently excluded from multivariate analysis. For the three PKS-positive Antarctic sites, MP, HI and RR, diversity was comparable, with 30, 32 and 35 ASVs respectively. Rarefaction curves did not reach asymptote (Fig. 2.4A), indicating that greater sequencing depth would capture further Type I PKS diversity at these locations.

For NRPS, between 6 and 56 ASVs were recovered per site. Rarefaction curves neared asymptote, indicating that the sampling strategy provided adequate coverage of diversity (Fig. 2.4B). The human-impacted CS exhibited the lowest NRPS diversity, with only 6 ASVs, and was subsequently removed from multivariate analysis. The greatest diversity was observed at BP, while all four Vestfold Hills sites (AF, HV, OW and RL) displayed relatively high NRPS diversity.

Figure 2.4 Capture of natural product diversity in polar soils. (A) For PKS, sequencing depth did not quite capture total diversity. (B) For NRPS, rarefaction curves are nearing or reaching horizontal asymptote, indicating sufficient sequencing depth.

2.3.3 Classification and distribution of natural product gene cluster families

Retrieved PKS sequences were assigned to nine phyla, including Actinobacteria (84%), Proteobacteria (4%), Cyanobacteria (3%), and Bacteroidetes (1%). KS/AT primers also amplified genes from Deinococcus-Thermus (1%), Chloroflexi (< 0.1%), Nitrospirae (< 0.1%), and Gemmatimonadetes (< 0.1%). Furthermore, 7% of all sequences were assigned to dehydratases within several Euryarchaeota genera (Fig. 2.5). Overall, 66% of the sequenced reads corresponded to KS/AT domains; the remainder were characterised as phosphatases (25%), dehydratases (7%), a transposase (< 1%), a primase/polymerase (< 1%), epoxide hydrolase (< 0.1%), oxidoreductase (< 0.1%) and uncharacterised proteins (< 0.1%) (Appendix Table A1.1). As many BGCs contain these domains, they were retained in the analysis (Donadio et al. 2007, Li et al. 2008, Migita et al. 2009, Aparicio 2003). PKS gene sequences matched > 25 known NP biosynthesis pathways (Appendix Table A1.1), primarily antifungals (pimaricin, heronamide, antifungal L-155,175, ambruticin), and to a lesser extent antibiotics (simocyclinone, quartromicin, thuggacin and rubradirin) and antiparasitics (lobosamide and indanomycin) (Aparicio 2003, Schulze et al. 2015).

Figure 2.5 PKS domain sequence taxonomy by genera and phyla, assigned through BLASTx analysis. Bubble size represents relative abundance of total reads. KS/AT domains were sequenced from 22 samples, from six sites in total. The highest relative abundance and diversity were found in Antarctic sites MP, RR and HI. Some non-KS/AT amplicons were also identified, including DH domains from Euryarchaeota, which were amplified from all HI samples and one MP sample. Interestingly, DHs are regularly found within PKS BGCs.

After protein sequence analysis, two known gene clusters remained: one with 64% similarity to simocyclinone, an antibiotic with antitumour activity (Trefzer et al. 2002, Flatman et al. 2005), and the other exhibiting 35% similarity to quartromicin, a spirotetronate with activity against human immunodeficiency virus (Wu et al. 2014) (Appendix Table A1.1).

NRPS gene sequences were assigned to nine bacterial phyla (Fig. 2.6), with the majority belonging to the Actinobacteria (40%). Other established NRP-producers were also represented, including Proteobacteria (22%), Cyanobacteria (19%) and Firmicutes (17%). ASVs were also assigned to five phyla that are less commonly associated with NRP production: Nitrospinae/Tectomicrobia (2%), Planctomycetes (1%), Chloroflexi (< 0.1%), Armatimonadetes (< 0.1%) and Deferribacteres (< 0.1%). A high proportion (90%) of recovered NRPS sequences corresponded to known AD domains (Appendix Table A1.2), as well as hypothetical proteins (10%), hybrid NRPS-PKS (6%) and ATP-dependent acyl-CoA ligases (< 0.1%). Nucleotide analysis of NRPS sequences revealed > 20 matches to known bioactive compound gene clusters (Appendix Table A1.2). These included antitumour agents (quinocarcin, collismycin, nannocystin), antibacterials (gramicidin, clorobiocin, teixobactin, bacitracin), and antifungals (myxochromide, microsclerodermin) (Kawatani et al. 2016, Raaijmakers et al. 2010, Schäberle et al. 2014). At the protein level, only one match to a known bioactive compound sequence remained, which was 49% similar to the antifungal surfactant iturin (Appendix Table A1.2).

Figure 2.6 NRPS domain sequence taxonomy by genera and phyla, assigned through BLASTx analysis. Bubble size represents relative abundance of total reads. Domains were sequenced from nine sites. Many ASV communities were shared across Antarctic regions, except the human-impacted CS. Actinobacteria and Proteobacteria were the most dominant phyla, in accordance with 16S bacterial biodiversity (Fig. 2.9).

2.3.4 Phylogenetic analysis of NP domain sequences

Phylogenetic analysis confirmed the novelty of the PKS gene families recovered, which were distributed among three main branches (Fig. 2.7). Similarities to known bioactive compounds included nystatin, an antifungal belonging to the same family of polyene macrolides as pimaricin (Aparicio 2003); the well-known anthelmintic avermectin (Burg 1979); and tetronomycin, a polyether tetronate antibiotic active against Gram-positive bacteria (Keller-Juslén et al. 1982) (Fig. 2.7). One branch containing five sequences consisted entirely of homologues to compounds with potent antitumour properties: calicheamicin and epothilone (Nicolaou & Dai 1991).

Phylogenetic analysis confirmed a high level of novelty in NRPS sequences (Fig. 2.8). Of particular note, 19 ASVs formed a monophyletic branch that contained only a single characterised representative, the syringomycin biosynthetic cluster in the Gammaproteobacterial genus Lysobacter. Interestingly, several other branches contained syringomycin biosynthetic cluster sequences, albeit from a variety of genera (Fig. 2.8). Syringomycin, gramicidin and iturin are peptides which exhibit both biosurfactant and antibiotic properties (Raaijmakers et al. 2010). Other branches comprised matches most similar to antibiotics (tyrocidine and bacitracin), cyanobacterial toxins (microcystin and cyanopeptolin), and antitumour agents (actinomycin, bleomycin and epothilone) (Fig. 2.8) (Ageitos et al. 2017, Nicolaou & Dai 1991, Faltermann et al. 2014).

Figure 2.7 Phylogenetic relationship of PKS protein sequences with reference bacteria based on BLASTx output. Evolutionary relationships were determined by the maximum likelihood method, using MUSCLE and the PHYML algorithm to perform ~1400 bp multiple sequence alignment, and visualised in iTOL. Polar soil ASVs are indicated in bold. Branches with bootstrap values < 0.5 have been collapsed. Branches of the tree generally group according to type of encoded biosynthetic compound, such as those with antitumour activity (yellow).

Figure 2.8 Phylogenetic relationship of NRPS protein sequences against reference bacteria based on BLASTx output. Evolutionary relationships were determined by the maximum likelihood method, using MUSCLE and the PHYML algorithm to perform ~700 bp multiple sequence alignment, and visualised in iTOL. Polar soil ASV sequences are indicated in bold. Branches with bootstrap values < 0.5 have been collapsed. Biosynthetic genes encoding compounds with surfactant properties (syringomycin and gramicidin) are present on all branches of the tree (purple).

2.3.5 Bacterial and Actinobacterial diversity of polar soils

As observed previously, the polar soils examined comprised a high proportion of phyla associated with NP biosynthesis, particularly Actinobacteria (16.6-42.8%) and Proteobacteria (8.8-41.6%) (Fig. 2.9) (van Dorst et al. 2014, Ferrari et al. 2015). In terms of overall bacterial diversity, the Windmill Islands sites MP and RR were most similar, while CS, which is human-impacted, showed the lowest similarity to any other soil sample. Interestingly, BP communities departed from the regional patterns observed, and were more similar to the high Arctic sites.

Figure 2.9 Soil bacterial diversity observed from 16S amplicon sequencing of soil from each of the 12 sites analysed. Phyla level diversity revealed soils at all sites to be dominated by NP-producing phyla, in particular the Actinobacteria and Proteobacteria.

Figure 2.10 Actinobacterial diversity by Order and Family, observed from 16S amplicon sequencing. (A) Actinomycetales were the dominant order at CS, Solirubrobacterales dominated MP and RR, and MC47 was prominent within BP and the high Arctic sites (SS, SV, AFH). In contrast, the Vestfold Hills soils (AF, HV, OW, RL) were dominated by Acidimicrobiales and Rubrobacterales. (B) At Family level, Actinobacteria communities comprised a large unclassified proportion, particularly for MP, BP, RR and the high Arctic sites.

Further analysis of Actinobacteria showed that, at order level, Actinomycetales dominated at CS (Fig. 2.10A). However, this did not correspond to an increase in NRPS or PKS richness (Fig. 2.6). Furthermore, no trend was observed between NP gene richness and relative abundance of Actinobacteria or Actinomycetales in these soils. At the family level, a large proportion of Actinobacteria were unclassified (Fig. 2.10B), particularly those from the Antarctic sites MP, BP and RR, and the three high Arctic sites.

2.3.6 Relationships between polar natural product genes, microbiomes and soil fertility parameters

NP gene amplicons were not detected in 25% of our samples, most notably those from the more fertile high Arctic sites (Fig. 2.11). Drier, lower carbon soils were more likely to result in amplification of the PKS and NRPS-coding genes targeted with the primer sets employed here (Table 2.3, Fig. 2.11). For PKS, a significant correlation was observed with DMF (P < 0.001) (Table 2.3), while significant correlations were observed between NRPS genes and TC (P < 0.001), TN (P < 0.001), and DMF (P < 0.001), with carbon being the most important factor associated with a lack of NRPS gene recovery (Table 2.3). Soils exhibiting < 75% DMF were negative for PKS amplicons, while those < 80.8% DMF were negative for NRPS (Fig. 2.11). Soils > 36,410 ppm TC were negative for the recovery of PKS amplicons, while those comprising > 18,490 ppm TC were NRPS negative (Fig. 2.11).

Figure 2.11 Natural product gene amplification revealed significant relationships with soil carbon (A), and dry matter fraction (DMF) (B). Drier, lower carbon soils were more likely to be positive for PKS and NRPS-coding genes. Carbon was most statistically significant for NRPS (P < 0.001), while only DMF was significant for PKS (P < 0.001).

Table 2.3 Analyses of relationship between natural product gene presence and total carbon (TC), total nitrogen (TN) and dry matter fraction (DMF).

                    PKS        NRPS
Separate Model
  Log TC            0.090      < 0.001
  Log TN            0.733      < 0.001
  DMF               < 0.001    < 0.001
Paired Model
  Log TC            0.027      < 0.001
  DMF               0.001      0.088

  Log TC            -          0.001
  Log TN            -          0.923

  Log TN            -          < 0.001
  DMF               -          1

Values are P-values; significance level P < 0.05.

For the sequenced NP communities, PKS Chao1 richness was significantly negatively correlated with soil moisture (1-DMF) (Fig. 2.12A), while NRPS Chao1 gene richness estimates displayed significant negative correlation with soil fertility factors carbon and nitrogen (Fig. 2.12B & C).

Figure 2.12 Natural product gene association with soil fertility factors revealed significant (P < 0.05) negative correlations. (A) PKS gene richness as a function of soil moisture. (B & C) NRPS gene richness (Chao1) as a function of carbon (B) and nitrogen (C).

Figure 2.13 Natural product gene nMDS analysis. (A) PKS and (B) NRPS. ASV communities have clustered according to their geographic region. Windmill Islands sites BP and HI form individual clusters, while MP and RR group together. Vestfold Hills sites (AF, HV, OW & RL) form a grouped cluster.

Non-metric multidimensional scaling (nMDS) ordination plots showed that the NP gene sequence communities obtained were more similar within sites than between sites (Fig. 2.13). In both PKS and NRPS analyses, the Windmill Islands sites MP and RR clustered together, while BP and HI formed individual clusters (Fig. 2.13A & B). For NRPS, regional clustering was observed, with distinct groups of assemblages forming for Windmill Islands and Vestfold Hills sites (Fig. 2.13B). Similar relationships were observed in both the 16S rDNA gene bacterial communities (Fig. 2.14A) and environmental parameters (Fig. 2.14B). Indeed, significant correlations (P = 0.001) were found between NP gene diversity and bacterial diversity (Mantel r = 0.615 (PKS), 0.81 (NRPS)).

Figure 2.14 Bacterial community 16S rDNA gene analysis and measured soil parameters show clustering similarities. Principal coordinates ordination (PCO) reveals geographically distinct groupings for both 16S bacterial diversity (A) and environmental parameters (B).

2.3.7 NP domain sequence novelty

For all sites except CS, the majority of NRPS gene families recovered were novel, sharing low similarity (< 70%) to genes that synthesise known NPs (Fig. 2.15A). In particular, the Vestfold Hills sites HV and AF contained the greatest proportions of novel sequences (92% and 84%, respectively). For PKS (Fig. 2.15B), excluding the low diversity sites (AFH, SS & SV), the highest proportion of novel (< 70% similarity) PKS sequences was retrieved from the Windmill Islands site HI (91%).

Figure 2.15 Natural product domain sequence novelty when compared to known secondary metabolite protein sequences for NRPS (A) and PKS (B). The majority of NP gene sequences that were recovered exhibited < 70% sequence identity to known NRPS or PKS protein sequences, indicating a high potential for novel compound production by bacteria in these polar soils.

2.4 DISCUSSION

Exploration of NP-encoding gene sequences using long-read technology revealed intriguing functional groupings for both PKS and NRPS ASVs in polar desert soils. Antarctic NRPS AD domain sequences predominantly clustered with biosurfactant-like lipopeptide and decapeptide BGCs in phylogenetic analysis, specifically syringomycin and gramicidin (Fig. 2.8). Biosurfactant peptides are versatile metabolites, with roles in cell motility, cation chelation, soil-water distribution, biofilm formation, sporulation, and degradation of hydrocarbons, in addition to antibiotic activities (Raaijmakers et al. 2010, Fechtner et al. 2011). Biosurfactant production is common in cold-adapted microorganisms, particularly in Bacillus, Burkholderia, Pseudomonas, Rhodococcus and Sphingomonas (Perfumo et al. 2018), genera which have been previously recovered from eastern Antarctic soils (Pudasaini et al. 2017, Nicetic 2016, Wong 2018). While further work is required to confirm biosurfactant secretion from bacterial isolates, the widespread occurrence of biosurfactant genes in polar soil bacteria suggests that they provide a competitive advantage through enhancement of water and nutrient bioavailability.

For PKS KS/AT, many ASVs demonstrated closest homology to a variety of polyene macrolide antifungal agents, such as pimaricin and nystatin (Appendix Table A1.1, Fig. 2.7), compounds which are hypothesised to provide Actinobacteria with a competitive edge over fungi in soil environments (Aparicio 2003). Polyene macrolides interact with ergosterol, the major cell membrane sterol in fungi, affecting membrane integrity and inhibiting transport of amino acids and glucose across the membrane (te Welscher et al. 2012, Sant et al. 2016, Aparicio et al. 2016).

Biosynthetic gene richness has been previously associated with low carbon and low soil moisture content across a range of soil biomes (Charlop-Powers et al. 2014). Here we found carbon, nitrogen and moisture content to be correlated with both the detection and diversity of our targeted NP-encoding gene sequences (Figs. 2.11 & 2.12, Table 2.3). Drier, more nutrient-starved soils were more likely to yield greater amplification and diversity of PKS and NRPS gene sequences across both poles, using the degenerate primer sets employed in this study (Tables 2.1 & 2.2). Interestingly, NP-encoding genes were either not successfully recovered or exhibited the lowest diversity in polar soils with the greatest anthropogenic influence, including high Arctic Svalbard sites (SS and SV), and eastern Antarctic site CS. This is contrary to the relatively high abundances of Actinomycetales, the leading NP-producing bacterial order, reported at these sites (Fig. 2.10A) (Bérdy 2005). The correlation of NP genes with low-nutrient soils supports their ecological relevance and functional usefulness regarding competition between microbes for limited resources (de Pascale et al. 2012).

While the threshold for determining functional gene novelty is disputable, some studies have stated that a sequence identity < 70% is considered novel for secondary metabolite genes (Busti 2006, Komaki et al. 2008). The majority (79.6%) of retrieved NP-encoding sequences were under this threshold (Appendix Tables A1.1 & A1.2; Fig. 2.15), indicating value for novel metabolite bioprospecting in eastern Antarctic soils. Additionally, our results revealed potential for NP production in rare and previously unknown PK- and NRP-producing phyla, including Nitrospirae, Armatimonadetes, Deinococcus-Thermus, Gemmatimonadetes and the Euryarchaeota (Figs. 2.5 & 2.6, Appendix Tables A1.1 & A1.2) (Wang et al. 2014).

Here, we successfully employed long-read amplicon sequencing technology to capture large PCR domain fragments (PKS ~1400 bp and NRPS ~700 bp), allowing translation into amino acid sequences and enabling functional taxonomic predictions (Figs. 2.7 and 2.8). Through our screening efforts we established a number of sites for future novel natural product bioprospecting, with particularly exciting targets being arid soils of the Windmill Islands region (HI, MP and RR) of eastern Antarctica and hyper-arid soils from the Vestfold Hills (AF, RL and HV), which contained the highest diversity of potentially novel natural products. We conclude that our sequencing approach is an advance for screening analyses of large gene fragments such as PKS and NRPS.

CHAPTER THREE

3 CULTURING COLD ADAPTED BACTERIA FROM MAJOR NATURAL PRODUCT PRODUCING PHYLA USING NOVEL APPROACHES

3.4 INTRODUCTION

Culture-dependent approaches are known to vastly underestimate soil microbial diversity (Amann et al. 1995, Cary et al. 2010, Ferrari et al. 2008, Lewis 2013). However, for NP discovery, microbial isolation remains critical to downstream analysis (Milshteyn et al. 2014, Katz & Baltz 2016). Rarely-cultured members of the dominant NP phyla (the Actinobacteria, Proteobacteria, Firmicutes and Cyanobacteria) are thought to represent the greatest potential for novel bioactives, along with other rare and as-yet-uncultured divisions, estimated to contain a wealth of hidden chemical diversity (Lewis 2013, Müller et al. 2015). Traditional culturing methods, which rely on serial liquid dilutions and plating to nutrient-rich artificial media (Zengler 2009), do not usually prove successful for capturing novel taxa, even within the well characterised Actinobacteria and Proteobacteria phyla (Jensen & Mafnas 2006, Nichols et al. 2010a, Janssen et al. 2002).

The Actinomycetales genus Streptomyces has historically provided the richest source of bacterial NPs (Bérdy 2005, Baltz 2007). In recent years, however, attention has turned to another prolific but under-studied order: the Gram-negative Myxococcales (Masschelein et al. 2017). These fascinating microorganisms belong to the Deltaproteobacteria and are ubiquitous in soil, but to date, descriptions of cold-adapted members are rare (Wenzel & Müller 2009, Herrmann et al. 2017, Dawid et al. 1988). Myxococcales show predatory, co-operative social behaviour, and swarm toward food sources using slime secretion and gliding motility, analogous to snail movement (Wenzel & Müller 2009). Prey comprise organic macromolecules, which include other microorganisms (Wenzel & Müller 2009, Dawid 2000). Under starvation conditions, colonies develop into conspicuous fungus-like fruiting bodies 50-500 µm in size (Fig. 3.1) (Wenzel & Müller 2009, Shimkets et al. 2006, Dawid 2000). Traditional culturing techniques often overlook the Myxococcales, which are outcompeted by more abundant, faster-growing species (Shimkets et al. 2006). Myxococcales are usually isolated directly from environmental samples such as soil, wood and animal dung, exploiting the taxon's unique features, such as the formation of fruiting bodies, which are easily visualised and give rise to predatory cells that swarm toward a bait source, typically comprising bacteria, yeast or cellulose (Shimkets et al. 2006, Karwowski et al. 1996, Dawid et al. 1988, Gaspari et al. 2005).

Figure 3.1 Under starvation conditions Myxococcales form conspicuous, macroscopic fruiting bodies. (A) Myxococcus fulvus on soil crumbs. (B) Stigmatella aurantiaca on wood particles. (C) M. stipitatus on wood particles. (D) M. virescens on rabbit dung. Magnification bar = 500 µm. Adapted from: Dawid (2000).

35 The results from PKS and NRPS domain sequencing in Chapter 2 indicated that the most

36 exciting novel NP biomining targets were pristine eastern Antarctic soils with low soil

37 fertility factors (Figs. 2.5, 2.6, 2.12, 2.15). Three of those pristine soils were selected here

38 for culturing based on PKS/NRPS gene findings: HI, MP and RL. Selected sites were

39 particularly low in carbon and moisture, they displayed a high level of novelty and

40 diversity of biosynthetic domain sequences and had high relative abundances of

41 Actinobacteria and Proteobacteria. Additionally, regional clustering had been observed

42 in multivariate analyses in Chapter 2; thus, soils were selected that represented the three

43 main clusters (Fig. 2.13, 2.14). Specifically, HI was selected because it displayed the

44 lowest average carbon content of all sites (703 ppm) (Fig. 2.12B), was one of the driest

45 (av. 96% DMF) (Fig. 2.12A), and displayed a diversity of both PKS and NRPS domains

46 (Figs. 2.5, 2.6). Importantly, 91% of all PKS domain sequences from HI were deemed

47 novel (< 70% similarity) (Fig. 2.15). Of the pristine sites, HI exhibited the highest relative

48 abundance of Actinomycetales (Fig. 2.10A) and formed a unique cluster in all

49 multivariate analyses (Figs. 2.13, 2.14). The MP site contained a diversity of PKS and

50 NRPS biosynthetic domains, with around 80% of sequences novel (Figs. 2.5, 2.6, 2.15)

51 and formed a cluster with RR in all multivariate analyses (Figs. 2.13, 2.14). Out of MP

52 and RR, MP displayed the lowest average soil carbon (4050 ppm) and moisture (97%


53 DMF), and highest average proportions of Actinobacteria (30%) and Proteobacteria

54 (11%) (Figs. 2.9, 2.12). RL was selected to represent the Vestfold Hills regional cluster

55 (Figs. 2.13, 2.14), and it measured the lowest average carbon content of this region (1415

56 ppm) (Fig. 2.12B). On average, RL contained a high relative abundance of Proteobacteria

57 (19%) (Fig. 2.9). Culturing studies have not been previously reported for any of the

58 chosen sites. As a comparison to the three pristine sites, a fourth site, Wilkes Tip (WT),

59 situated close to CS (Fig. 2.1A), was selected to represent a human-impacted Antarctic

60 soil.

61

In this chapter, two non-traditional oligotrophic culturing methods were employed with the aim of targeting rare and cold-adapted NP-producing bacteria, specifically the Myxococcales and Actinomycetales (Bérdy 2005, Masschelein et al. 2017). The first method was adapted from myxobacterial cultivation methods and is referred to here as direct soil culturing (DSC). Soil was incubated directly on low-nutrient media with additional

67 bait sources (Shimkets et al. 2006). The second method was the soil substrate membrane

68 system (SSMS) (Fig. 1.9A), a novel culturing approach which has previously enabled

69 recovery of new species of Proteobacteria, Actinobacteria and Bacteroidetes (van Dorst

70 et al. 2016, Ferrari et al. 2005). The SSMS has been found to enrich rarely-isolated taxa

71 such as Saccharibacteria (previously known as candidate division TM7), as well as rare

72 phyla shown to harbour biosynthetic gene clusters, including Gemmatimonadetes,

73 Chloroflexi, Chlorobi and Verrucomicrobia (Ferrari et al. 2005, van Dorst et al. 2016,

74 Wang et al. 2014). Here, the SSMS was employed under psychrophilic incubation

75 temperatures for the first time.


3.5 MATERIALS AND METHODS

3.5.1 Site description and soil characteristics

Figure 3.2 Antarctic soils used for culturing were selected from three pristine polar

deserts and one human-impacted site. (A) Herring Island (HI), HI/T2/200, (B) Mitchell

Peninsula (MP), MP/T2/200, and (C) Rookery Lake (RL), RL/T2/200, East Antarctica.

The fourth sample, (D) Wilkes Tip (WT), was collected from a site contaminated with a

variety of waste including fuel and domestic rubbish (Fryirs et al. 2013). Photographs

courtesy of AAD.

3.5.1.1 Herring Island

HI is an ice-free island, devoid of vascular plant life (Fig. 3.2A), and is composed primarily of garnet-bearing granite gneiss rock (Paul et al. 1995, Bailey et al. 2016). The island is remote from human activity, situated approximately 15 km south of Casey station (Fig. 2.1A), and is frequented by Weddell seals and a variety of petrel seabird species (AADC, 2018, Paul et al. 1995, Bailey et al. 2016). The selected HI sample (HI/T2/200) was low in moisture, carbon and nitrogen, with a near-neutral pH (Table 3.1). Its sequenced bacterial diversity showed a remarkably high relative abundance of Actinobacteria (67%), followed by Chloroflexi (14%) and Acidobacteria (5%) (Fig. 3.3).

87

Table 3.1 Location and soil characteristics for selected Antarctic soils.

                        HI                  MP                  RL                  WT
AAD Barcode             36815               36809               120310              124573
Transect/Distance       T2/200 m            T2/200 m            T2/200 m            Bulk soil
Antarctic Region        Windmill Is.        Windmill Is.        Vestfold Hills      Windmill Is.
Moisture (%)            3.2                 4.6                 0.06                11 *
Total Carbon (ppm)      600                 2042                1114                < 5000 *
Total Nitrogen (ppm)    130                 210                 130                 < 5000 *
pH                      6.6                 5.2                 7.4                 5.3 *
Latitude/Longitude      66° 24' 41"S,       66° 18' 46"S,       68° 29' 34"S,       66° 15' 35"S,
                        110° 39' 30"E       110° 32' 4"E        78° 6' 47"E         110° 32' 22"E
Geological composition  Garnet-bearing      Garnet-bearing      Mossel gneiss       Unknown
                        granite gneiss      granite gneiss      (orthopyroxene-
                                                                quartz-feldspar)

* estimates based on Chong et al. 2009


Figure 3.3 Bacterial 16S rDNA diversity for the three pristine samples cultured; HI,

MP and RL. All three soils have high relative abundance of Actinobacteria and

Chloroflexi phyla. Proteobacteria also make up a large proportion in the MP and RL

samples. The fourth site, WT, was not characterised by 16S gene sequencing.

88

3.5.1.2 Mitchell Peninsula

MP lies approximately 5 km south of Casey station (Fig. 2.1A) (AADC, 2018) and, like HI, is an ice-free desert formed from garnet-bearing granite gneiss (Fig. 3.2B) (Paul et al. 1995, Bailey et al. 2016). Fauna have not been recorded at MP (Ji et al. 2016, Chong et al. 2009). However, some vegetation, in the form of a low diversity of lichens and bryophytes, has been described by Melick et al. (1994). The MP sample (MP/T2/200) was higher in carbon and moisture content than HI, and was more acidic (Table 3.1). The sample's bacterial community showed greater diversity than that of HI, including a high proportion of candidate divisions (WPS-2 and AD3; 9% and 6%, respectively), and was dominated by Actinobacteria (26%), Chloroflexi (17%) and Proteobacteria (14%) (Fig. 3.3).

99

3.5.1.3 Rookery Lake

The RL sampling site (Fig. 3.2C) is located 1.7 km north-east of Rookery Lake, a circular body of water situated on Long Peninsula, Vestfold Hills, approximately 9.3 km north of Davis station (Fig. 2.1B). The lake supports several Adélie penguin colonies (AADC, 2018). Long Peninsula is formed primarily from Mossel gneiss rock (Sheraton 1983). The selected RL sample (RL/T2/200) was hyper-arid and exhibited a slightly alkaline pH (Table 3.1). Sequenced bacterial 16S rDNA diversity revealed high relative abundances of the phyla Actinobacteria (33%), Proteobacteria (18%), Chloroflexi (16%) and Gemmatimonadetes (13%) (Fig. 3.3).

109

110 3.5.1.4 Wilkes Tip

111 The fourth soil, from WT, was collected as a single bulk sample in 2005 from a contaminated

112 site undergoing evaluation for bioremediation (Table 3.1, Fig. 3.2D), and is situated on Clark

113 Peninsula, approximately 3 km north from Casey station (Fig. 2.1A). WT was a former

114 rubbish disposal site for Wilkes station, a facility abandoned in 1969 (Fryirs et al. 2013). WT

115 is contaminated with a diversity of legacy waste including general domestic rubbish, fuel

116 drums, gas cylinders, batteries and mechanical items. The site is almost permanently covered

117 by snow and ice, except in years of extreme melt, which occur every 4-5 years (Fryirs et al.

2013, AAD, 2002). Unlike samples HI, MP and RL, the WT sample has not been analysed for soil physical and chemical properties, nor for bacterial 16S diversity. Estimates of WT soil chemical properties were made here based on data from Chong et al. (2009), who analysed soil from the same site (Table 3.1). They also reported bacterial diversity by denaturing gradient gel electrophoresis (DGGE) fingerprinting of amplified 16S rDNA gene fragments. Their results were dominated by the Cytophaga–Flexibacter–Bacteroides phylum, followed by Proteobacteria (Chong et al. 2009).

125

126 3.5.2 Direct soil culturing methods

127 3.5.2.1 Herring Island and Mitchell Peninsula DSC

128 For HI and MP DSC, two baiting methods were employed: Escherichia coli lawn, and

129 cellulose bait (Fig. 3.4). Two preparations were used for culturing soils, designated 'untreated'

130 and 'pretreated'. The untreated soils (1 g) were removed from -80°C storage and defrosted at

131 4°C, suspended in 500 µL of sterile Milli-Q water and briefly vortexed before use, while the

132 pretreated soils (0.5 g) were defrosted at RT (~21°C), air dried in covered petri plates at 37°C

133 for 30 min, then suspended in 3.5 mL sterile Milli-Q water. The soil-water suspension was

placed in an ultrasonicator (XUBA1, Grant, UK) at 44 kHz for 1 min, then incubated in a water

135 bath at 56°C for 10 min (Karwowski et al. 1996). This pretreatment was hypothesised to

136 select for Myxococcales, whose spores are resistant to mild heat and sonication. Other spore-

137 forming bacteria such as Streptomyces spp. should also be similarly selected (Karwowski et

138 al. 1996, Daza et al. 1989).

139

For the pretreated soils, water agar plates (WCX) were prepared with 25 µg/mL cycloheximide (R&D Systems, Minneapolis) (Appendix A2.2) to suppress fungal growth (Shimkets et al. 2006). For the untreated soil WCX plates, the cycloheximide concentration was doubled to 50 µg/mL (Appendix A2.2). Four plates were prepared for each soil as follows: a dense suspension of live E. coli ATCC 25922 was applied to WCX agar plates as either a cross streak (Fig. 3.4A) or circular lawns (Fig. 3.4C), and dried at RT. For the cellulose bait

146


147

Figure 3.4 Direct soil culturing using both E. coli lawn (A & C) and cellulose (B & D)

baiting methods on WCX agar plates. The soil preparations were either pretreated with

mild heat and sonication (A & B), or untreated (C & D).

148

149 methods, sterile 10 mm diameter Whatman® grade 1 filter paper discs were applied to WCX

150 agar either singularly (Fig. 3.4B) or in groups (Fig. 3.4D) (Dawid et al. 1988, Shimkets et al.

151 2006). Pea-sized portions (~10 mm diameter) of either the pretreated (Fig. 3.4A & B) or

152 untreated soil (Fig. 3.4C & D), were then applied to the surface of each bait, using a sterile

153 spatula. Plates were wrapped in parafilm and incubated at RT in the dark, for up to 8 months,

154 with small amounts of sterile water added periodically to maintain moisture.

155


156 3.5.2.2 Rookery Lake and Wilkes Tip DSC

157 RL and WT soils were cultured using a third Myxococcales culturing method, consisting of

158 E. coli baiting with the addition of rabbit dung pellets (Fig. 3.5). Herbivore dung has been

159 demonstrated as a favoured substrate for Myxococcales, with rabbit dung the most commonly

160 used in isolation studies (Shimkets et al. 2006). Prior to use, the dung pellets, collected from

161 an Australian property on the NSW/QLD border, were autoclaved at 121°C for 45 min,

162 cooled, soaked for 1 hr in cycloheximide solution (30 µg/mL) to inhibit fungal growth, and

163 aseptically dried. Duplicate WCX plates with 50 µg/mL cycloheximide were prepared with

164 circular E. coli lawns and portions of untreated soil preparation, as previously described (Fig.

165 3.4C). Rabbit dung pellets were moistened with liquid from the untreated soil preparation,

166 then embedded into soil portions on the WCX E. coli plate (Fig. 3.5) (Gaspari et al. 2005,

167 Shimkets et al. 2006). Plates were wrapped in parafilm and incubated at RT in the dark for

168 up to 7 months, with small amounts of sterile water added periodically to maintain moisture.

169

170

Figure 3.5 Direct soil culturing using the E. coli lawn method with the addition of

rabbit dung pellets. Culturing was performed using untreated soils on WCX agar plates.


171 3.5.2.3 Isolation and purification of bacteria from DSC

172 Following incubation, all DSC WCX plates were observed every 1-3 d under a

173 stereomicroscope (40x magnification), and light microscope (100x magnification), for

visualisation of Myxococcales-like fruiting bodies, or other visible colony formation. Visible

175 colonies were picked directly using microscopy and a sterile toothpick and sub-cultured onto

176 a variety of media: 0.75x Nutrient Agar (NA) (Oxoid, Thermo Scientific, Massachusetts),

177 soil extract with gellan gum (SEGG) (Appendix A2.1 & A2.2), and WCX agar with E. coli

178 or cellulose bait (Appendix A2.2). Purified isolates were maintained on 0.75x NA.

179

180 3.5.3 SSMS culturing at cold temperatures

181 In addition to DSC, the HI soil was selected for culturing via the SSMS, adapted from Ferrari

182 et al. (2008) (Fig. 3.6) with the aim of selecting for cold-adapted microorganisms. All

183 equipment and reagents were equilibrated to 4°C prior to use, and low temperatures (< 10°C)

were maintained throughout the experiment.

185

186 HI soil (16.5 g) was removed from -80°C storage and defrosted at 4°C. SSMS cultures were

187 prepared in triplicate. Tissue culture inserts (TCI) (Millicell®, 30 mm, polycarbonate, 0.4

188 µm, Millipore, Australia), were used to provide the soil substrate for bacterial growth. Each

189 TCI was prepared by gently vortexing 4.5 g HI soil with ~300 µL of 0.9% NaCl to form a

190 homogenous soil slurry which evenly covered the underside of the filter membrane (Fig.

191 3.6A). The slurry was then secured against the membrane by filling the remaining TCI space

192 with gellan gum (5 g/ L) (Gelzan™ Gelrite®, Sigma-Aldrich), and the TCI inverted into the

193 6-well culture plate and placed at 4°C while the inoculum was prepared; 3 g of HI soil was

194 added to 27 mL 0.9% NaCl and vigorously vortexed for 10 s. Large particles were allowed


195

196

Figure 3.6 Principles of the soil substrate membrane system (SSMS). (A) Tissue

culture inserts (TCI) are prepared with a soil slurry of the soil sample of interest. (B) A

polycarbonate membrane is inoculated with microbial suspension and applied to the outer

TCI filter. Nutrients for growth diffuse through the filters from the soil. (C) The TCI

replicates are incubated within a 6-well culture plate, with sterile water added to outer

wells to prevent drying. Source: Ferrari et al. (2008).

197

198 to sediment at 4°C for 1 min. A 1:100 dilution was then prepared by adding 100 µL of the

199 1:10 dilution to 900 µL 0.9% NaCl. For each triplicate culture, a 25 mm diameter, 0.22 µm

200 pore size, hydrophilic polycarbonate membrane (PCM) (Isopore®, Millipore) was placed


201 onto a moistened 25 mm diameter glass fibre filter (Whatman) on a sample filtration manifold

202 (Carbon 14 Centralen, Denmark) fitted with Millivac-Mini vacuum pump (Millipore). A 20

203 mL sterile stainless-steel cylinder was then secured and filled with 10 mL 0.9% NaCl and 50

µL of 1:100 inoculum, and filtered onto PCM replicates. Each PCM was then applied to an inverted TCI membrane, ensuring complete contact (Fig. 3.6B), and the TCIs were inserted into the 6-well plate. Sterile water was added to the plate's outer wells to maintain hydration of

207 cultures (Fig. 3.6C). The plate was sealed with parafilm and incubated at 4°C, for a total of

208 162 d.
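As an illustrative check only (it was not part of the original protocol), the soil budget and dilution series described above can be tallied in a short Python sketch. The quantities are those stated in the text; treating 3 g of soil in 27 mL saline as a 1:10 (w/v) dilution, and the variable names themselves, are assumptions made here for illustration.

```python
from fractions import Fraction

# Quantities stated in the SSMS protocol text.
TCI_REPLICATES = 3
SOIL_PER_TCI_G = Fraction("4.5")   # soil used for each slurry
INOCULUM_SOIL_G = Fraction(3)      # soil used for the inoculum
SALINE_ML = Fraction(27)           # 0.9% NaCl added to the inoculum soil

# Total soil thawed: three slurries plus the inoculum portion.
total_soil_g = TCI_REPLICATES * SOIL_PER_TCI_G + INOCULUM_SOIL_G

# Assumption: 3 g soil + 27 mL saline is treated as a 1:10 (w/v) dilution,
# i.e. roughly 1 g of soil per 10 mL of suspension.
first_dilution = INOCULUM_SOIL_G / (INOCULUM_SOIL_G + SALINE_ML)

# 100 µL of the 1:10 dilution into 900 µL saline gives the 1:100 dilution.
second_dilution = first_dilution * Fraction(100, 100 + 900)

# Soil equivalent (mg) carried onto each membrane by 50 µL of the 1:100 dilution.
soil_equiv_mg = second_dilution * Fraction(50, 1000) * 1000

if __name__ == "__main__":
    print(f"Total soil required: {float(total_soil_g)} g")
    print(f"Dilutions: 1:{1 / first_dilution}, 1:{1 / second_dilution}")
    print(f"Soil equivalent per membrane: {float(soil_equiv_mg)} mg")
```

Under these assumptions the total matches the 16.5 g of HI soil thawed, and each membrane receives the equivalent of about 0.5 mg of soil.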

209

210 3.5.3.1 Assessing microcolony growth and bacterial viability on the SSMS

Microcolony growth and viability were assessed at 50, 78 and 162 d of incubation (Fig. 3.7A). On each occasion, ¼ of a PCM from one TCI replicate was excised using a sterile razor blade and secured to a microscope slide with 0.1% agarose. The PCM portion was treated with 1 drop (~25 µL) of Vectashield mounting medium (Vector Laboratories, California), containing

215 a 1:1 ratio of Ultrapure™ water and the LIVE/DEAD® BacLight™ Bacterial Viability stain

216 (Invitrogen), and incubated at 4°C in the dark for 30 min. PCM portions were then observed

217 via epi-fluorescent microscopy using an Olympus BX51 microscope with DP74 camera

218 (Olympus, North Ryde, Australia), and filters appropriate for excitation/emission maxima of

219 480/500 nm for SYTO 9 and 490/635 nm for propidium iodide (PI). When stained with the

220 SYTO 9 and PI nucleic acid stains, live intact cells fluoresce green, while dead cells fluoresce

221 red.


Figure 3.7 Flowchart for bacterial cultivation using cold-incubated SSMS. (A) Portion of the PCM was removed and microcolony growth and viability assessed by epi-fluorescent microscopy using a live/dead stain. (B) Portion of the PCM was vortexed with saline to dislodge and suspend cells. (C) PCM was removed from saline and rubbed over surface of RAVAN media. (D) The cell suspension was serial diluted and spread-plated. (E) The cell suspension was passed into two rounds of enrichment in RAVAN liquid media and spread-plated. (F) Resulting colonies were sub-cultured to RAVAN media for purification, then nutrient media for maintenance.


222 3.5.3.2 Secondary cultivation of SSMS microcolonies using artificial media

In addition to microscopy at 50, 78 and 162 d incubation, ¼-size portions of PCMs from

224 TCI replicates were placed into 1.5 mL tubes containing 1 mL 0.9 % NaCl and vortexed for

225 1 min to dislodge cells (Fig. 3.7B). PCMs were removed from the cell suspension and applied

226 directly over the surface of 8°C equilibrated RAVAN/ TSV/ GG plates (Appendix A2.1 &

227 A2.2) and wrapped in parafilm (Fig. 3.7C). The low-concentration culturing medium,

228 RAVAN, is designed to select for oligophilic bacteria (Watve 2000), and was adapted here

229 to target Actinomycetales and novel bacteria. RAVAN was prepared at 0.05x concentration,

230 modified with additional trace salt and vitamin solutions and gellan gum was used as the

231 solidifying agent (Appendix A2.1 & A2.2). Gellan gum was used as it may improve capture

232 of environmental bacteria, including novel phyla such as Gemmatimonadetes (Tanaka et al.

233 2014). Trace salts were added to provide electrolytes and minerals commonly used for

234 recovery of Actinomycetales and have been found to significantly promote sporulation in

235 Streptomyces (Shirling & Gottlieb 1966, Karandikar et al. 1996). The trace vitamin solution

236 was included because B-vitamins have been found to improve recovery of Actinomycetales

237 from environmental samples (Hayakawa & Nonomura 1987, Zotchev et al. 2008, Wolin et

238 al. 1963).

239 For the cell suspension (Fig. 3.7B), serial dilutions were made by addition of 100 µL cell

240 suspension to 900 µL 0.9% NaCl (Fig. 3.7D). Each of the dilutions, as well as the undiluted

241 cell suspension, were spread onto RAVAN/ TSV/ GG plates. Additional liquid culture

242 enrichments were made (Fig. 3.7E), whereby 10 µL of each cell suspension was added to 0.2

243 mL tubes containing 190 µL RAVAN/ TSV broth (Appendix A2.2). Following incubation at

244 8°C for 15-20 d, 100 µL aliquots of enrichments were plated onto RAVAN/ TSV/ GG. This

245 process was repeated for a total of two enrichments (Fig. 3.7E).


247 3.5.3.3 Isolation and purification of bacteria from SSMS cultures

248 Spread-plated cultures (Fig. 3.7) were regularly observed for growth, with incubation ranging

between 27-347 d at 8°C. Visible colonies were picked using a sterile 1 µL loop and sub-cultured onto RAVAN/ TSV/ GG plates until pure colonies were obtained (Fig. 3.7F). Once

251 established in pure culture, isolates were tested for the ability to grow on 0.75x NA at 8°C,

252 followed by RAVAN/ TSV/GG and 0.75x NA at RT.

253

254 3.5.4 Gram and lactophenol cotton blue stain differentiation

Gram staining was performed on all cultured isolates for initial characterisation, to determine purity, and to aid elimination of fungal isolates from further analysis (Beveridge 2001). A small portion of a single colony was removed with a sterile loop, emulsified with 1 drop of sterile water on a glass slide, and air dried. The smear was heat-fixed and flooded with Gram's crystal violet solution (Sigma-Aldrich) for 1 min. Slides were rinsed with water, and Gram's iodine solution (Sigma-Aldrich) was applied for 1 min. Smears were decolourised with 95% EtOH, rinsed with water, and flooded with dilute carbol fuchsin (Pro-Lab Diagnostics) for 30 s. Smears were air dried, then visualised by oil-immersion light microscopy.

264

265 Lactophenol cotton blue mounts were additionally performed on suspected fungal colonies

266 (Leck 1999). One drop of Lactophenol Cotton Blue stain (Sigma-Aldrich) was applied to a

267 glass slide. With minimal disruption, colonies were removed with a sterile loop, combined

268 with the stain and a coverslip applied. The wet mount was visualised by light microscopy.

269


270 3.5.5 Isolate DNA extraction and purification

271 Genomic DNA was extracted from isolates using a bead-beating approach followed by

272 ethanol precipitation. A single large bacterial colony was transferred to a 2 mL screw-top

273 microcentrifuge tube (Sarstedt AG and Co., Germany), containing 1 mL autoclaved Milli-Q®

274 water (Merck Millipore, Massachusetts), and 0.5 g of an equal proportion 0.1 mm and 0.5 mm

275 diameter glass beads (Mo Bio, Carlsbad). The mixture was homogenized using the FastPrep®-

276 120 homogenisation instrument (MP Biomedicals, California) for 40 s, on speed setting 6.0, and

277 incubated for 5 min at 95°C. Samples were centrifuged at 20,800 x g for 3 min, and DNA lysates

278 removed.

279

280 For ethanol precipitation, 1/10 volume of 3M sodium acetate (CH3COONa, pH 5.2) was added

281 to DNA lysates, followed by two volumes of ice-cold 100% EtOH. DNA was precipitated at 8°C

282 for 20 min, then centrifuged at 17,900 x g for 20 min, and supernatants discarded. Pellets were

283 re-suspended in 1 mL fresh 70% EtOH, and centrifuged at 17,900 x g for 5 min. Following

284 removal of supernatants, pellets were dried on a heat block for 15 min at 55°C, re-suspended in

285 150 μL TE buffer (10 mM Tris-HCl (pH 8.0), 0.1 mM EDTA). Genomic DNA lysates were

286 quantified using Nanodrop and stored at -20°C until further use.

287

288 3.5.6 PCR amplification and Sanger sequencing of isolate 16S rDNA genes

289 Taxonomic identification of strains was performed based on Sanger sequencing of the 16S

290 rDNA gene for selected strains. Near full-length 16S bacterial rDNA genes were PCR

amplified from gDNA using the primer set 27F/1492R (Table 3.2) (Integrated DNA Technologies, Singapore) (Lane et al. 1985). Reaction mixtures contained 5 µL 5x Green GoTaq® Flexi Buffer (Promega, Wisconsin), 2.5 mM MgCl2, 0.2 mM each dNTP, 10% v/v

294 Dimethylsulfoxide (DMSO) (Sigma-Aldrich), 0.4 µM each primer, 0.625 units of GoTaq®


295 Hotstart DNA Polymerase (Promega), 3 µL of purified DNA template, and Ultrapure™ water

296 to 25 µL (Invitrogen). Amplification was performed in an MJ Mini™ Thermal Cycler (Bio-

297 Rad, Australia). Thermocycler conditions were as follows: 94°C for 2 min, 30 cycles of 94°C

298 for 30 s, 60°C for 30 s, 72°C for 90 s, final extension at 72°C for 5 min. PCR amplification

299 was confirmed via gel electrophoresis, with 10 µL PCR product loaded onto 2% (w/v) agarose

300 gel in Tris-acetate-ethylenediaminetetraacetic acid buffer (1 x TAE), with addition of 0.01%

301 SYBR safe DNA stain (Invitrogen). Gels were visualised using the Safe Imager™ 2.0 Blue Light

302 Transilluminator (Invitrogen).

303

Table 3.2 Primer sets employed for PCR targeting 16S bacterial rDNA, PKS and

NRPS domain fragments.

Target     Primer name  Primer sequence (5'-3')   Length (bp)  Reference
16S rDNA   27F          AGAGTTTGATCMTGGCTCAG      1500         Lane 1985
           1492R        TACGGYTACCTTGTTACGACTT
PKS        K1F          TSAAGTCSAACATCGGBCA       1200-1400    Ayuso-Sacido & Genilloud 2004
           M6R          CGCAGGTTSCSGTACCAGTA
NRPS       A3F          GCSTACSYSATSTACACSTCSGG   700          Ayuso-Sacido & Genilloud 2004
           A7R          SASGTCVCCSGTSCGGTAS
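The primers in Table 3.2 contain IUPAC ambiguity codes (M, Y, S, B, V), so each primer is in practice a pool of related sequences. As an illustration only, and not an analysis performed in this study, the following Python sketch counts how many distinct sequences each degenerate primer represents. The lookup table follows the standard IUPAC nucleotide code; the function and variable names are chosen here for illustration.

```python
from math import prod

# Number of bases each IUPAC nucleotide code can stand for.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}

# Primer sequences as listed in Table 3.2.
PRIMERS = {
    "27F": "AGAGTTTGATCMTGGCTCAG",
    "1492R": "TACGGYTACCTTGTTACGACTT",
    "K1F": "TSAAGTCSAACATCGGBCA",
    "M6R": "CGCAGGTTSCSGTACCAGTA",
    "A3F": "GCSTACSYSATSTACACSTCSGG",
    "A7R": "SASGTCVCCSGTSCGGTAS",
}


def degeneracy(sequence: str) -> int:
    """Return the number of unambiguous sequences a degenerate primer encodes."""
    return prod(IUPAC_DEGENERACY[base] for base in sequence.upper())


if __name__ == "__main__":
    for name, sequence in PRIMERS.items():
        print(f"{name}: {len(sequence)} nt, {degeneracy(sequence)}-fold degenerate")
```

By this count the NRPS primers (A3F and A7R) are far more degenerate than the 16S primers, which is consistent with their design to capture diverse adenylation domains.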

305

306 PCR products were submitted directly to The Ramaciotti Centre for Gene Function Analysis,

307 at UNSW Sydney (NSW, Australia), for purification and preparation for single-end

308 sequencing, using primer 1492R, on the Sanger ABI 3730 Capillary Sequencer (Applied

309 Biosystems, Australia).

310

311 Resulting FASTA sequences (~1200 bp) were visualised with FinchTV v1.4.0 trace viewer

312 (Geospiza, Washington, USA), quality trimmed to ~1000 bp, and compared with known gene


313 sequences in GenBank, using the BLAST search tool (Altschul et al. 1990). For isolates with

314 identical 16S gene sequences, two representative strains were chosen for further analysis.

315

316 3.5.7 Cryopreservation of strains

317 Two strains from each species from each site/method (HI, MP and RL DSC and HI SSMS)

318 were selected for cryopreservation in triplicate using the Microbank™ cryovial bacterial

319 storage system (Pro-Lab Diagnostics, Canada). Pure cultures in exponential growth phase on

320 solid media were aseptically transferred to cryovials and the tubes inverted for ~30 s to allow

321 binding of bacterial cells to supplied beads. Excess liquid was removed, and tubes were

322 stored at -80°C until further use.

323

324 3.5.8 Type I PKS and NRPS domain screening by PCR

325 Isolate genomic DNA was screened for presence of Type I PKS and NRPS genes, targeting the

326 conserved KS/AT and AD domains. Each 50 μL reaction comprised 10 μL 5X Green GoTaq®

327 Flexi Buffer (Promega), 0.2 mM each dNTP, 2.5 mM MgCl2, 10% v/v DMSO, 0.8 μM each

primer (PKS: K1F/M6R or NRPS: A3F/A7R) (Table 3.2), 1.25 units of GoTaq® Flexi DNA

329 Polymerase (Promega), 18.75 μL Ultrapure™ water, and 5 µL purified gDNA.
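The reaction composition above can be sanity-checked with simple arithmetic. The Python sketch below is illustrative only: it uses the volumes and percentages stated in the text, and derives the volume left over for the remaining reagents (dNTPs, MgCl2, primers and polymerase), whose stock concentrations are not given here and are therefore not assumed.

```python
from fractions import Fraction

REACTION_UL = Fraction(50)

# Volumes stated explicitly in the text (µL).
stated_volumes_ul = {
    "5X Green GoTaq Flexi Buffer": Fraction(10),
    "Ultrapure water": Fraction("18.75"),
    "purified gDNA": Fraction(5),
}

# 10% v/v DMSO in a 50 µL reaction.
dmso_ul = Fraction(10, 100) * REACTION_UL

# Final buffer strength: 10 µL of 5X stock diluted into 50 µL.
buffer_final_x = 5 * stated_volumes_ul["5X Green GoTaq Flexi Buffer"] / REACTION_UL

# Volume accounted for, and volume left for dNTPs, MgCl2, primers and polymerase.
accounted_ul = sum(stated_volumes_ul.values()) + dmso_ul
remaining_ul = REACTION_UL - accounted_ul

if __name__ == "__main__":
    print(f"Buffer final strength: {float(buffer_final_x)}X")
    print(f"DMSO volume: {float(dmso_ul)} µL")
    print(f"Volume left for other reagents: {float(remaining_ul)} µL")
```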

330

Thermocycler conditions for Type I PKS comprised 94°C for 2 min, 30 cycles of 94°C for 30 s, 55°C for 30 s and 72°C for 2 min, and a final extension at 72°C for 5 min. For NRPS, conditions comprised 95°C for 5 min, 30 cycles of 95°C for 30 s, 59°C for 30 s and 72°C for 4 min, and a final extension at 72°C for 10 min.

334 The positive control was purified genomic DNA from the Type I PKS and NRPS positive

335 Streptomyces strain, CZ24 (van Dorst et al. 2017).
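For planning purposes, the programmed hold time of each thermocycler protocol described in this chapter (16S rDNA in Section 3.5.6; Type I PKS and NRPS above) can be estimated as follows. This is a minimal sketch, assuming that ramping time between temperatures is ignored; the class and dictionary names are hypothetical and serve only to organise the values given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Program:
    """A simple three-stage PCR program; all durations in seconds."""
    initial_denaturation_s: int
    cycles: int
    cycle_steps_s: tuple  # (denaturation, annealing, extension)
    final_extension_s: int

    def hold_time_s(self) -> int:
        # Total programmed time, excluding temperature ramping.
        per_cycle = sum(self.cycle_steps_s)
        return self.initial_denaturation_s + self.cycles * per_cycle + self.final_extension_s


# Values transcribed from the methods text.
PROGRAMS = {
    "16S rDNA": Program(120, 30, (30, 30, 90), 300),
    "Type I PKS": Program(120, 30, (30, 30, 120), 300),
    "NRPS": Program(300, 30, (30, 30, 240), 600),
}

if __name__ == "__main__":
    for target, program in PROGRAMS.items():
        print(f"{target}: {program.hold_time_s() / 60:.0f} min (excluding ramping)")
```

The longer NRPS program reflects its 4 min extension step per cycle.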

336


337 3.5.9 In situ antimicrobial testing by cross-streak method

338 Strains were screened for antimicrobial activity using the cross-streak agar method (Carvajal

339 1947, Hopwood 2007, Kamat & Velho-Pereira 2011). This is a relatively rapid screening

340 assay to establish antimicrobial activity from the isolate in situ and provides semi-

341 quantitative results (Kamat & Velho-Pereira 2011). Strains were inoculated in triplicate onto

342 NA as a central streak using a sterile 1 µL loop (Fig. 3.8). Plates were incubated at RT for 1-

343 7 d depending on genus, to allow sufficient growth and production of active compounds.

344

345 Test pathogens comprised five opportunistic human pathogen strains commonly utilized in

346 antibiotic sensitivity testing (ATCC 2014). They included a selection of Gram-positive

347 pathogens: Staphylococcus aureus ATCC 25923 and Bacillus subtilis ATCC 11774; Gram-

348 negative pathogens: E. coli ATCC 25922 and Pseudomonas aeruginosa ATCC 27853; and

349 one fungal pathogen, Candida albicans ATCC 10231. Test pathogens were streaked from

350 the edge of the plate to the polar isolates in perpendicular lines using a 1 µL sterile loop (Fig.

351 3.8). Plates were incubated for a further 1-4 d at RT, and the zone of inhibition measured.

Negative controls consisted of pathogens streaked in an identical way with no isolate. Positive controls were performed by the disc diffusion method (Bondi et al. 1947), whereby a small portion of test pathogen colony was inoculated into 1 mL phosphate-buffered saline (PBS), spread-plated onto NA, and allowed to dry. Discs infused with tobramycin (30 µg/mL) (Bio-Rad, California) were applied to the bacterial lawns, while amphotericin B discs (10 µg/mL) (Sigma-Aldrich, Missouri) were applied to C. albicans lawns, and the plates were incubated at RT for 48 h before measurement of zones of clearing.

359


Figure 3.8 Pattern of inoculation for cross-streak agar assay. Test pathogens were

inoculated perpendicular to each polar isolate analysed.

360

361 3.5.10 Type I PKS and NRPS domain screening and antimicrobial assays for strains

362 isolated in previous studies

363 An additional 20 strains, which had been previously isolated from eastern Antarctic sites

Browning Peninsula (BP), Robinson Ridge (RR) and Wilkes Tip (WT) (Fig. 2.1A) by

365 colleagues (Pudasaini et al. 2017, Nicetic 2016), were selected here to undergo Type I PKS

366 and NRPS domain screening and cross-streak antimicrobial assay. These strains comprised

367 14 Actinobacteria, four Proteobacteria and two Firmicutes (Table 3.3).

368


Table 3.3 Strains isolated in previous studies which were screened for PKS and

NRPS domains and antimicrobial activity.

Site  Strain  Closest cultured representative    Phylum            ID (%)
RR    INR13   Azospirillum zeae                  α-Proteobacteria  100
      INR4    Bacillus aryabhattai               Firmicutes        100
      INR6    Burkholderia jiangsuensis          β-Proteobacteria  99
      INR15   Frondihabitans australicus         Actinobacteria    99
      INR9    Leifsonia shinshuensis             Actinobacteria    99
      INR17   Mesorhizobium qingshengii          α-Proteobacteria  100
      INR7    Streptomyces spororaveus           Actinobacteria    99
WT    INWT7   Cryobacterium mesophilum           Actinobacteria    99
      INWT5   Methylobacterium brachiatum        α-Proteobacteria  99
      INWT6   Quadrisphaera granulorum           Actinobacteria    99
      INWT3   Rhodococcus aerolatus              Actinobacteria    99
BP    SPB151  Kribbella sandramycini             Actinobacteria    99
      SPB164  Mycobacterium fluoranthenivorans   Actinobacteria    99
      SPB16   Paenisporosarcina macmurdoensis    Firmicutes        99
      SPB1    Rhodococcus yunnanensis            Actinobacteria    100
      SPB167  Streptomyces abikoensis            Actinobacteria    99
      SPB35   Streptomyces beijiangensis         Actinobacteria    99
      SPB162  Streptomyces fildesensis           Actinobacteria    99
      SPB13   Streptomyces indigoferus           Actinobacteria    99
      SPB4                                       Actinobacteria    99

RR: Robinson Ridge, WT: Wilkes Tip, BP: Browning Peninsula

370

371 3.5.11 Bacterial 16S rDNA gene analysis for pristine soils

372 Bacterial 16S rDNA sequencing data previously described in Chapter 2 (Section 2.2.2), was

373 analysed for the three pristine soil samples used in this chapter, and visualised as relative

374 abundance by phyla (Fig. 3.3) in R 3.4.0 using the ggplot2 package 2.2.1 (Wickham 2009).

375


376 3.5.12 Venn diagram visualisation of species shared between sites

377 Cultured bacterial species shared across sites by DSC and SSMS methods were calculated

378 and visualised as a Venn diagram in R 3.4.0 with the VennDiagram package 1.6.17 (Fig.

379 3.15) (Chen & Boutros 2011).
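The Venn diagram itself was produced in R, as described above. Purely as an illustration of the underlying set logic, the Python sketch below shows how shared and unique species could be tallied per site/method. The species labels are placeholders and do NOT correspond to isolates recovered in this study; the function names are likewise hypothetical.

```python
from itertools import combinations

# Placeholder species labels only; NOT data from this study.
cultured = {
    "HI_DSC": {"sp_A", "sp_B", "sp_C"},
    "MP_DSC": {"sp_B", "sp_D"},
    "RL_DSC": {"sp_C", "sp_E"},
    "HI_SSMS": {"sp_A", "sp_F"},
}


def pairwise_shared(groups: dict) -> dict:
    """Species shared between every pair of groups (pairs in sorted name order)."""
    return {
        (a, b): groups[a] & groups[b]
        for a, b in combinations(sorted(groups), 2)
    }


def unique_to(groups: dict, name: str) -> set:
    """Species found in one group and in none of the others."""
    others = set().union(*(s for key, s in groups.items() if key != name))
    return groups[name] - others


if __name__ == "__main__":
    for pair, shared in pairwise_shared(cultured).items():
        print(pair, sorted(shared))
    for name in cultured:
        print(name, "unique:", sorted(unique_to(cultured, name)))
```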

380

381 3.5.13 Biotechnological and biosynthetic potential of isolates

382 Based on overall results, a number of isolates were chosen to undergo whole genome

383 sequencing in Chapter 4. Isolates were prioritised based on the following criteria:

384 • Antimicrobial activity,

385 • Presence of NP domains,

386 • 16S rDNA gene sequence identity to known species < 99%,

387 • Rarely-cultured Actinobacteria and Proteobacteria groups,

388 • High quality genome assembly for the species absent from the genome taxonomy

389 database (GTDB) (http://gtdb.ecogenomic.org/),

390 • Relevance to other ongoing Antarctic microbial research, including known

391 hydrocarbon degrading genera with potential for bioremediation, and pigmented

392 strains, which are commonly associated with NPs and are valuable to diverse

393 industries.
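The criteria listed above can be expressed as a simple checklist. The Python sketch below is one possible way to tabulate them and is not the procedure used in this study: no weighting scheme is described in the text, so ranking by the number of criteria met is an assumption made here, and the isolate records are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Isolate:
    name: str
    antimicrobial_activity: bool
    np_domains_detected: bool
    top_16s_identity_pct: float
    rarely_cultured_group: bool
    quality_genome_in_gtdb: bool
    relevant_to_ongoing_research: bool


def criteria_met(isolate: Isolate) -> list[str]:
    """Return the labels of the prioritisation criteria an isolate satisfies."""
    checks = {
        "antimicrobial activity": isolate.antimicrobial_activity,
        "NP domains present": isolate.np_domains_detected,
        "16S identity < 99%": isolate.top_16s_identity_pct < 99.0,
        "rarely-cultured group": isolate.rarely_cultured_group,
        "no high-quality genome in GTDB": not isolate.quality_genome_in_gtdb,
        "relevant to ongoing research": isolate.relevant_to_ongoing_research,
    }
    return [label for label, passed in checks.items() if passed]


def rank(isolates: list[Isolate]) -> list[Isolate]:
    # Assumption: equal weighting; more criteria met means higher priority.
    return sorted(isolates, key=lambda iso: len(criteria_met(iso)), reverse=True)


# Hypothetical records for illustration only.
EXAMPLES = [
    Isolate("isolate_Y", False, False, 99.6, False, True, True),
    Isolate("isolate_X", True, True, 98.5, False, False, False),
]

if __name__ == "__main__":
    for isolate in rank(EXAMPLES):
        print(isolate.name, criteria_met(isolate))
```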

394

3.6 RESULTS

3.6.1 Direct soil culturing

397 At 8 d incubation, mycelium-like microcolonies were observed extending out from soil

398 particles and into the WCX agar (Fig. 3.9). Further sub-culturing and analysis revealed these

399 to be Streptomyces species. During the ~8 month incubation period, visible colonies were


400 sub-cultured from the surfaces of agar (Fig. 3.10A), soil particles (Fig. 3.10B) and dung

401 pellets (Fig. 3.10D).

402

Figure 3.9 Substrate mycelium-like filaments were observed by microscopy of direct

soil cultures. (A) Mycelium extending from soil particles, and (B) spreading throughout

the agar. The mycelium were sub-cultured using a sterile toothpick, and gave rise to

various Streptomyces spp.

403

From 15 d incubation, blue-pigmented fruiting forms began developing on soil particles (Fig. 3.10C). These were determined to be fungal following sub-culturing and microscopy with

406 Gram and lactophenol cotton blue staining. Over the culturing period, several filamentous

407 fungal morphotypes grew prevalently on WCX plates, particularly pretreated soils from MP,

408 RL and WT. This was despite the application of increased concentrations of cycloheximide

409 (Section 3.1.2.1). Visibility of bacterial microcolonies was thus reduced, leading to lower

410 numbers of colonies picked for sub-culturing.

Figure 3.10 Visible colonies were directly picked from soil cultures using a sterile toothpick and stereomicroscopy. (A) HI E. coli baiting plate with untreated soil; yellow pigmented colonies were Rhodococcus spp. (B) MP cellulose baiting plate with untreated soil; Streptomyces sp. M1 was recovered from the microcolonies seen here growing on the soil crumb surface. (C) HI cellulose baiting with pretreated soil. Several filamentous fungi grew on the WCX plates despite the addition of cycloheximide. Here, blue-pigmented conidia were visible on the surface of a soil crumb. (D) White sporulating colonies on the surface of rabbit dung pellets. Similar sub-cultured colonies from RL were Streptomyces sp.

In total, > 100 bacteria were isolated by DSC from all four sites, with 43 isolates determined to be different at species level (Table 3.4). HI yielded the greatest number of species (28), compared to MP (7), RL (4) and WT (4) (Fig. 3.15). The majority of isolates belonged to Actinobacteria (32); the remainder were Alphaproteobacteria (6), Betaproteobacteria (3), and one representative each from the Bacteroidetes and Firmicutes phyla (Table 3.4).

Myxococcales fruiting bodies were not detected over the 8-month observation period, nor were any Deltaproteobacteria recovered, which may reflect their low abundance in these soils (0.2-0.7% rDNA gene relative abundance). Interestingly, Streptomyces was the most abundant genus recovered by DSC, comprising 40% of all isolates (Fig. 3.11, Table 3.4). Streptomyces were particularly abundant in HI, which culture-independent methods had shown to harbour a high relative abundance of Actinobacteria (Fig. 3.3). Streptomyces colony morphology was varied (Fig. 3.11), and sporulation was predominantly olive green (Fig. 3.11B and F), white (Fig. 3.11A and C) or brown (Fig. 3.11D). One species, S. lienomycini NBH81, produced a striking red colony pigmentation (Fig. 3.11E). Diffused melanin-like pigments ranged from very dark brown, as in S. lavendulae NBH20 and S. gougerotii NBH77 (Fig. 3.11A and D), to tan-coloured, for example S. flavogriseus NBH21 and S. parvus NBM1 (Fig. 3.11B and F). Species that produced no diffused melanin-like pigments included S. atroolivaceus NBH70 and S. lienomycini NBH81 (Fig. 3.11C and E).

Table 3.4 Phylogenetic distribution of bacterial species cultured from all sites by DSC.

Strain  Closest species match            Phylum            Sim. (%) Accession    Bait  Soil trmt Days†  No.
NBM4    Arthrobacter koreensis           Actinobacteria    99       KP715106.1   E     P         41     1
NBM25   Burkholderia cepacia             β-Proteobacteria  99       KT906686.1   E     U         154    1
NBM12   Burkholderia sordidicola         β-Proteobacteria  99       KJ606828.1   E, C  U, P      96     4
NBH87   Frigoribacterium faeni           Actinobacteria    99       KX809655.1   E     U         134    1
NBWT11  Geodermatophilus soli            Actinobacteria    98       NR_109440.1  ED    U         146    1
NBWT1   Geodermatophilus terrae          Actinobacteria    99       NR_109441.1  ED    U         42     1
NBH84   Hymenobacter xinjiangensis       Bacteroidetes     97       JF496493.1   E     U         124    1
NBH82   Janibacter melonis               Actinobacteria    99       KT720303.1   E     U         124    1
NBRL9   Massilia timonae                 β-Proteobacteria  99       EU221406.1   ED    U         42     1
NBH50   Methylobacterium populi          α-Proteobacteria  100      KY882116.1   E     P         39     2
NBRL2   Microbacterium aerolatum         Actinobacteria    100      LN774527.1   ED    U         27     1
NBWT6   Microbacterium foliorum          Actinobacteria    100      KY405917.1   ED    U         50     1
NBH49   Microbacterium schleiferi        Actinobacteria    98       KY681786.1   E     P         36     1
NBH85   Microbacterium testaceum         Actinobacteria    100      KX809655.1   E     P         151    1
NBH64   Micrococcus yunnanensis          Actinobacteria    99       MH790299.1   E, C  U, P      32     >5
NBM3    Micrococcus yunnanensis          Actinobacteria    100      KX082873.1   E     U         46     4
NBRL5   Micrococcus yunnanensis          Actinobacteria    99       KT719527.1   ED    U         27     1
NBM11   Novosphingobium subterraneum     α-Proteobacteria  99       JF459977.1   E     P         35     1
NBH48   Paracoccus carotinifaciens       α-Proteobacteria  99       NR_024658.1  E     U         29     1
NBM5    Planococcus plakortidis          Firmicutes        99       LT160774.1   E     P         46     1
NBH57   Pseudarthrobacter sulfonivorans  Actinobacteria    99       KX056505.1   E     U         25     1
NBH51   Rhodococcus fascians             Actinobacteria    99       LN999546.1   E     P         85     >2
NBH73   Rhodococcus luteus               Actinobacteria    99       AJ576249.1   E     U, P      52     >6
NBH83   Sphingomonas aerolata            α-Proteobacteria  99       LN774415.1   E     U         116    1
NBH67   Sphingomonas endophytica         α-Proteobacteria  99       NR_117869.1  E     P         59     1
NBWT7   Sphingomonas mucosissima         α-Proteobacteria  99       JF496278.1   ED    U         50     1
NBH70   Streptomyces atroolivaceus       Actinobacteria    100      KX527679.1   E, C  U, P      25     2
NBH41   Streptomyces badius              Actinobacteria    100      KY007184.1   E, C  U, P      38     >6
NBH53   Streptomyces californicus        Actinobacteria    99       FJ481076.1   C     P         36     1
NBH13   Streptomyces coelicoflavus       Actinobacteria    100      KT758401.2   C     P         32     2
NBH1    Streptomyces cyaneofuscatus      Actinobacteria    99       KY514161.1   C     P         14     1
NBH86   Streptomyces daghestanicus       Actinobacteria    99       KX775313.1   E     U         134    1
NBH21   Streptomyces flavogriseus        Actinobacteria    99       KU324455.1   E     P         36     1
NBH65   Streptomyces globosus            Actinobacteria    100      KU324456.1   E     U         42     1
NBH77   Streptomyces gougerotii          Actinobacteria    99       KT758400.1   C     P         112    1
NBH78   Streptomyces griseus             Actinobacteria    99       FN298358.1   E     P         112    1
NBH20   Streptomyces lavendulae          Actinobacteria    99       KX698040.1   E     P         34     >4
NBH81   Streptomyces lienomycini         Actinobacteria    99       KY753328.1   C     P         155    2
NBH61   Streptomyces parvus              Actinobacteria    100      MF359745.1   E     U         24     >2
NBM1    Streptomyces parvus              Actinobacteria    100      MF359745.1   E     U         35     1
NBH42   Streptomyces praecox             Actinobacteria    100      KX507060.1   E, C  U, P      61     >5
NBH12   Streptomyces pratensis           Actinobacteria    99       KU973960.1   C     U, P      23     >7
NBRL4   Streptomyces pratensis           Actinobacteria    100      KU973960.1   ED    U         27     2

H: Herring Island, M: Mitchell Peninsula, RL: Rookery Lake, WT: Wilkes Tip, E: E. coli, C: cellulose, ED: E. coli & dung, P: pretreated, U: untreated, †: days from initial DSC set-up to colony picking, No.: number of strains cultured.

Figure 3.11 Colony morphology for six different Streptomyces isolates. (A) S. lavendulae NBH20 formed white sporulation, with a dark melanin-like pigmentation which diffused into surrounding agar. (B) S. flavogriseus NBH21 produced olive green/white sporulation with a tan-coloured diffused pigment. (C) For S. atroolivaceus NBH70, sporulation was white and diffused pigments were absent. (D) S. gougerotii NBH77 formed ringed brown/tan sporulating colonies with a dark melanin-like pigment. (E) S. lienomycini NBH81 produced red colony pigmentation and no diffused pigments. (F) S. parvus NBM1 formed olive green/white sporulating colonies with a tan melanin-like pigment.

For HI and MP DSC, the most successful baiting method was E. coli, which yielded the greatest diversity (15 genera) and number (24) of isolates, compared to only three genera from six isolates via cellulose baiting. Five isolates were recovered via both methods (Table 3.4). The pretreatment of soil with heat and sonication resulted in slightly more isolates (16) than untreated soil (12), while seven isolates grew from both treatments. The diversity of genera retrieved from each soil treatment was the same (10 genera each) (Table 3.4).

Figure 3.12 DSC isolates with ≤ 98% 16S rDNA gene sequence similarity to known species. (A) NBWT11 shared 98% similarity to Geodermatophilus soli. (B) NBH49 was 98% similar to Microbacterium schleiferi and (C) NBH84 exhibited 97% sequence similarity to Hymenobacter xinjiangensis.

Isolates were predominantly recovered from sub-cultures on 0.75x NA, rather than SEGG or 2nd round WCX bait media (Section 3.2.2.3). Exceptions were Janibacter melonis NBH82, Sphingomonas aerolata NBH83, Hymenobacter xinjiangensis NBH84, Frigoribacterium faeni NBH87 and S. parvus NBM1, which were retrieved through sub-culturing to 2nd round WCX and E. coli. The median time taken from initial DSC set-up to visible colony formation for the four soils ranged from 27-50 d (Table 3.4). Isolates which were particularly slow growing or had lengthy lag-phases took > 100 d for adequate growth to appear, including two potentially novel species: Geodermatophilus soli NBWT11 and H. xinjiangensis NBH84. Several species recovered from the 2nd round WCX and E. coli media were also slow growing, namely F. faeni NBH87, J. melonis NBH82 and Sphingomonas aerolata NBH83, as were several of the morphologically striking Streptomyces such as S. gougerotii NBH77 and S. lienomycini NBH81 (Fig. 3.11, Table 3.4).
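As a worked check on the per-site medians, the sketch below recomputes them in Python from the Days† column of Table 3.4. It assumes one value per table row (species-level entry) grouped by the site prefix of the strain name. This is only one possible reading of how the medians were derived, and the variable names are hypothetical.

```python
# Illustrative sketch: per-site median days from DSC set-up to colony picking,
# using the Days column of Table 3.4 (one value per table row; an assumption).
from statistics import median

days_to_picking = {
    "HI": [134, 124, 124, 39, 36, 151, 32, 29, 25, 85, 52, 116, 59, 25,
           38, 36, 32, 14, 134, 36, 42, 112, 112, 34, 155, 24, 61, 23],
    "MP": [41, 154, 96, 46, 35, 46, 35],
    "RL": [42, 27, 27, 27],
    "WT": [146, 42, 50, 50],
}

site_medians = {site: median(days) for site, days in days_to_picking.items()}
median_range = (min(site_medians.values()), max(site_medians.values()))

for site, value in site_medians.items():
    print(site, value)
print("range of site medians:", median_range)
```

Under this reading, the lowest and highest site medians come from RL and WT respectively, in line with the 27-50 d range stated above.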

Many of the DSC isolates showed high 16S rDNA gene sequence identity to known bacterial species (99-100%). Three exhibited sequence identities of 97-98%, indicating they may be novel species. These were G. soli NBWT11, Microbacterium schleiferi NBH49 and H. xinjiangensis NBH84 (Fig. 3.12).

3.6.2 Cold-temperature SSMS cultures

At 50 and 78 d incubation at 4°C, epi-fluorescent microscopy revealed small microcolonies composed of three or more small cocci or short rod-shaped cells < 1 µm (Fig. 3.13A). After 162 d of incubation, larger microcolonies were observed, predominantly of small cocci and short rod-shaped cells (Fig. 3.13B). Energy-limited cells are known to decrease their cell size, and alter cell shape to coccoid morphology (Lever et al. 2015). A small number of larger rod-shaped cells (6-8 µm) were also present at 162 d.

A total of 90 isolates were recovered from cold-temperature SSMS methods, with 10 of these determined to be different at species level (Table 3.5). Isolates belonged to both Actinobacteria and Proteobacteria phyla. Dominant genera were Rhodococcus, Pseudarthrobacter and Arthrobacter (Fig. 3.14), followed by Streptomyces (Table 3.5, Fig. 3.14D). One isolate (NBSH29) was a potentially novel strain, sharing low 16S rDNA sequence similarity to the closest known species Mesorhizobium olivaresii (98%) (Table 3.5, Fig. 3.14F).

Figure 3.13 Cold-incubated SSMS microcolonies visualised using epi-fluorescence microscopy. When stained with SYTO 9 and propidium iodide, live cells with uncompromised cell membranes fluoresce green, while dead/damaged cells fluoresce red. (A) At 50 d incubation only a few small microcolonies were observed. Cells were cocci and short rods < 1 µm in size. (B) Numerous live microcolonies were observed at 162 d incubation. Small cocci and short rod-shaped cells < 1 µm in size predominated.

Enrichment in liquid RAVAN media (Fig. 3.7) led to the recovery of three species which were present in low abundance. These were R. erythropolis NBSH38, M. olivaresii NBSH29, and S. clavifer NBSH56 (Table 3.5), all of which were recovered only after one enrichment round (Table 3.6). Two other low abundance species were recovered only without enrichment: S. lucensis NBSH23 and Simplicispira psychrophila NBSH78 (Table 3.6).

Table 3.5 Phylogenetic distribution of isolates cultured from Herring Island by cold-temperature SSMS

Strain  Herring cold-inc. SSMS           Phylum            Id (%)  Blast Acc.   Days†  No.
NBSH28  Arthrobacter alpinus             Actinobacteria    100     KR140255.1   277    >10
NBSH29  Mesorhizobium olivaresii         α-Proteobacteria  98      LN681548.1   294    3
NBSH8   Pseudarthrobacter sulfonivorans  Actinobacteria    100     KX056505.1   231    >20
NBSH38  Rhodococcus erythropolis         Actinobacteria    99      KU904404.1   254    1
NBSH10  Rhodococcus yunnanensis          Actinobacteria    99      LN999546.1   191    >14
NBSH90  Rhodococcus luteus               Actinobacteria    100     AJ576249.1   191    >20
NBSH78  Simplicispira psychrophila       β-Proteobacteria  99      NR_113622.1  288    1
NBSH56  Streptomyces clavifer            Actinobacteria    99      KU324446.1   280    >1
NBSH44  Streptomyces finlayi             Actinobacteria    100     KP718539.1   229    >5
NBSH23  Streptomyces lucensis            Actinobacteria    99      KJ571105.1   147    >1

†: mean days from SSMS set-up to colony picking, No.: number of strains cultured.

Figure 3.14 Bacteria cultured from HI by the cold-incubated SSMS. (A) The RAVAN media spread-plated communities were dominated by three main morphotypes: large yellow, large white, and smaller yellow-orange colonies. (B) Large white colonies were Pseudarthrobacter sulfonivorans (e.g. SH8). (C) Small yellow-orange colonies were Rhodococcus spp. such as R. luteus (e.g. SH90). (D) Several Streptomyces spp. were recovered, the most abundant of which was S. finlayi (e.g. SH44). (E) Large yellow colonies were Arthrobacter alpinus (e.g. SH28). (F) Mesorhizobium olivaresii SH29 exhibited 98% similarity to known species.

Table 3.6 SSMS followed by liquid media enrichment conditions for recovered species

                                                       Liquid media enrichment
Strain  Herring cold inc. SSMS           No.    0    x1   x2
NBSH28  Arthrobacter alpinus             >10
NBSH29  Mesorhizobium olivaresii         3
NBSH8   Pseudarthrobacter sulfonivorans  >20
NBSH38  Rhodococcus erythropolis         1
NBSH10  Rhodococcus yunnanensis          >14
NBSH90  Rhodococcus luteus               >20
NBSH78  Simplicispira psychrophila       1
NBSH56  Streptomyces clavifer            >1
NBSH44  Streptomyces finlayi             >5
NBSH23  Streptomyces lucensis            >1  *

*: Growth on RAVAN TSV only as co-culture

Cold-incubated SSMS isolates were slow to grow, taking a median of 280 d from initial SSMS set-up to visible colony formation (Table 3.5). Interestingly, the S. lucensis NBSH23 isolate grew on RAVAN/TSV/GG only in co-culture with other microorganisms (Table 3.6), suggesting that helper strains were supplying nutritional requirements not provided by the low-nutrient media alone. Pure colony isolation of this strain was only achieved through sub-culture onto nutrient-rich 0.75x NA. All cold-adapted species cultured by the SSMS at 8°C were also capable of growth at RT.

3.6.3 Summary of bacterial isolates cultured by DSC and SSMS

3.6.3.1 Total bacteria cultured by all methods across four sites

Overall, culturing from all sites resulted in a final library of 53 isolates, spanning 47 different species. Actinobacteria were the dominant phylum, with 34 species, of which 32 belonged to the order Actinomycetales. This was followed by Alphaproteobacteria (7 spp.), Betaproteobacteria (4 spp.), and one isolate each from the Bacteroidetes and Firmicutes phyla. Three species were recovered across multiple sites, suggesting they are endemic in these regions. These were Micrococcus yunnanensis, found at all pristine sites (HI, RL and MP) (Fig. 3.15), S. pratensis, recovered from HI and RL, and S. parvus, found at HI and MP. For HI, only two species were recovered by both DSC and SSMS methods: Pseudarthrobacter sulfonivorans and Rhodococcus luteus. The contaminated site, WT, did not share species with any other sites (Fig. 3.15).

Figure 3.15 Cultured bacterial species recovered across four Antarctic soils by DSC and the SSMS. In total, 47 species were cultured. Herring Island (HI) isolates recovered via DSC and SSMS shared only 2 species: Pseudarthrobacter sulfonivorans and Rhodococcus luteus. Pristine sites HI, RL and MP shared Micrococcus yunnanensis. HI and RL DSC also shared Streptomyces pratensis, and HI and MP DSC also shared S. parvus. The contaminated site, WT, did not share any species with the pristine sites.

3.6.3.2 Bacterial colony pigmentation

Approximately half (23) of all bacterial isolates recovered from eastern Antarctica displayed carotenoid-like pigmentation (Fig. 3.16), which varied from pale yellow through orange and red. Pigmented bacteria spanned all 4 phyla.

Figure 3.16 Carotenoid-like pigmentation was observed in half of all cultured isolates. Here, a selection of species is shown, displaying a range of pigmentation.

3.6.4 Natural product domain amplification and in situ antimicrobial activity for selected isolates

3.6.4.1 Strains isolated in this study

All 53 isolates cultured in this study were analysed for NP domains and bioactivity against five pathogens. Eighteen were positive for Type I PKS KS/AT domains, and 23 were positive for NRPS AD domains (Table 3.7). Of these, 10 Actinobacteria and one Betaproteobacterium were positive for both NP domains (Table 3.7). In the cross-streak antimicrobial assay, 15 strains showed measurable activity against the Gram-positive pathogens S. aureus and/or B. subtilis, three against the Gram-negative pathogens E. coli and/or P. aeruginosa, and four against the yeast C. albicans (Table 3.7). Streptomyces was the only genus that displayed measurable activity, although some inhibition of pathogen growth was evident for Paracoccus, Pseudarthrobacter, Rhodococcus, Novosphingobium and Sphingomonas species. The pathogen most commonly inhibited was B. subtilis (21 isolates) (Table 3.7, Fig. 3.17), followed by S. aureus (18 isolates), then C. albicans (11 isolates). S. lavendulae isolate NBH20 displayed the greatest activity, inhibiting all five pathogens.
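The domain counts reported above can be cross-checked against Table 3.7 with simple set arithmetic. The Python sketch below is an illustration only, not part of the thesis workflow: the strain identifiers and +/- calls are transcribed from Table 3.7, while the variable names are hypothetical.

```python
# Illustrative cross-check of the PCR results in Table 3.7 using set arithmetic.
# Strain IDs are those scored "+" in the PKS and NRPS columns of the table.
pks_positive = {
    "NBM25", "NBH85", "NBH51", "NBSH10", "NBH73", "NBSH90", "NBSH78", "NBH83",
    "NBSH56", "NBH86", "NBSH44", "NBH77", "NBH78", "NBH81", "NBH61", "NBM1",
    "NBH12", "NBRL4",
}
nrps_positive = {
    "NBM25", "NBM12", "NBH87", "NBSH38", "NBH51", "NBSH10", "NBH73", "NBSH90",
    "NBH70", "NBH41", "NBH53", "NBH13", "NBH1", "NBH86", "NBSH44", "NBH65",
    "NBH78", "NBH20", "NBH81", "NBSH23", "NBH42", "NBH12", "NBRL4",
}

both_domains = pks_positive & nrps_positive    # positive for PKS and NRPS
either_domain = pks_positive | nrps_positive   # positive for at least one domain

print(len(pks_positive), "PKS-positive")
print(len(nrps_positive), "NRPS-positive")
print(len(both_domains), "positive for both:", sorted(both_domains))
```

The intersection contains the single Betaproteobacterium (Burkholderia cepacia NBM25) alongside ten Actinobacteria, matching the breakdown given in the text.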

Figure 3.17 Cross-streak antimicrobial assay for bacterial isolates. Cold-incubated SSMS grown isolate NBSH44 showed measurable activity against Gram-positive pathogens, Bacillus subtilis and Staphylococcus aureus, and some inhibition of Gram-negative E. coli.

Table 3.7 Natural product domain amplification and in situ antimicrobial activity for strains isolated in this study.

                                                           PCR        Mean antimicrobial cross-streak activity (mm)
Strain  Closest cultured representative  Phylum            PKS  NRPS  S. aur.  B. subt.  E. coli  P. aerug.  C. albic.
NBSH28  Arthrobacter alpinus             Actinobacteria    -    -     0        0         0        0          0
NBM4    Arthrobacter koreensis           Actinobacteria    -    -     0        0         0        0          0
NBM25   Burkholderia cepacia             β-Proteobacteria  +    +     0        0         0        0          0
NBM12   Burkholderia sordidicola         β-Proteobacteria  -    +     0        0         0        0          0
NBH87   Frigoribacterium faeni           Actinobacteria    -    +     0        0         0        0          0
NBWT11  Geodermatophilus soli            Actinobacteria    -    -     0        0         0        0          0
NBWT1   Geodermatophilus terrae          Actinobacteria    -    -     0        0         0        0          0
NBH84   Hymenobacter xinjiangensis       Bacteroidetes     -    -     0        0         0        0          0
NBH82   Janibacter melonis               Actinobacteria    -    -     0        0         0        0          0
NBRL9   Massilia timonae                 β-Proteobacteria  -    -     0        0         0        0          0
NBSH29  Mesorhizobium olivaresii         α-Proteobacteria  -    -     0        0         0        0          0
NBH50   Methylobacterium populi          α-Proteobacteria  -    -     0        0         0        0          0
NBRL2   Microbacterium aerolatum         Actinobacteria    -    -     0        0         0        0          0
NBWT6   Microbacterium foliorum          Actinobacteria    -    -     0        0         0        0          0
NBH49   Microbacterium schleiferi        Actinobacteria    -    -     0        0         0        0          0
NBH85   Microbacterium testaceum         Actinobacteria    +    -     0        0         0        0          0
NBH64   Micrococcus yunnanensis          Actinobacteria    -    -     0        0         0        0          0
NBM3    Micrococcus yunnanensis          Actinobacteria    -    -     0        0         0        0          0
NBRL5   Micrococcus yunnanensis          Actinobacteria    -    -     0        0         0        0          0
NBM11   Novosphingobium subterraneum     α-Proteobacteria  -    -     †        †         0        0          †
NBH48   Paracoccus carotinifaciens       α-Proteobacteria  -    -     †        0         0        0          0
NBM5    Planococcus plakortidis          Firmicutes        -    -     0        0         0        0          0
NBH57   Pseudarthrobacter sulfonivorans  Actinobacteria    -    -     0        0         0        0          0
NBSH8   Pseudarthrobacter sulfonivorans  Actinobacteria    -    -     †        0         0        0          †
NBSH38  Rhodococcus erythropolis         Actinobacteria    -    +     0        0         0        0          0
NBH51   Rhodococcus fascians             Actinobacteria    +    +     0        0         0        0          0
NBSH10  Rhodococcus yunnanensis          Actinobacteria    +    +     0        0         0        0          0
NBH73   Rhodococcus luteus               Actinobacteria    +    +     0        0         0        0          †
NBSH90  Rhodococcus luteus               Actinobacteria    +    +     0        0         0        0          0
NBSH78  Simplicispira psychrophila       β-Proteobacteria  +    -     0        0         0        0          0
NBH83   Sphingomonas aerolata            α-Proteobacteria  +    -     0        0         0        0          0
NBH67   Sphingomonas endophytica         α-Proteobacteria  -    -     0        0         0        0          0
NBWT7   Sphingomonas mucosissima         α-Proteobacteria  -    -     0        †         0        0          †
NBH70   Streptomyces atroolivaceus       Actinobacteria    -    +     5        3         †        0          †
NBH41   Streptomyces badius              Actinobacteria    -    +     †        †         0        0          0
NBH53   Streptomyces californicus        Actinobacteria    -    +     †        8         †        0          †
NBSH56  Streptomyces clavifer            Actinobacteria    +    -     †        †         0        0          0
NBH13   Streptomyces coelicoflavus       Actinobacteria    -    +     3        1         0        1          0
NBH1    Streptomyces cyaneofuscatus      Actinobacteria    -    +     8        8         †        †          9
NBH86   Streptomyces daghestanicus       Actinobacteria    +    +     1        3         0        0          0
NBSH44  Streptomyces finlayi             Actinobacteria    +    +     1        8         †        0          0
NBH21   Streptomyces flavogriseus        Actinobacteria    -    -     0        4         0        0          0
NBH65   Streptomyces globosus            Actinobacteria    -    +     0        3         0        0          0
NBH77   Streptomyces gougerotii          Actinobacteria    +    -     5        3         0        0          2
NBH78   Streptomyces griseus             Actinobacteria    +    +     0        †         0        0          0
NBH20   Streptomyces lavendulae          Actinobacteria    -    +     11       12        2        †          6
NBH81   Streptomyces lienomycini         Actinobacteria    +    +     8        12        0        0          0
NBSH23  Streptomyces lucensis            Actinobacteria    -    +     2        5         0        †          0
NBH61   Streptomyces parvus              Actinobacteria    +    -     0        †         0        0          0
NBM1    Streptomyces parvus              Actinobacteria    +    -     1        1         3        0          2
NBH42   Streptomyces praecox             Actinobacteria    -    +     8        5         †        †          †
NBH12   Streptomyces pratensis           Actinobacteria    +    +     2        0         0        0          0
NBRL4   Streptomyces pratensis           Actinobacteria    +    +     0        †         0        0          0

†: Some inhibition of pathogen growth

3.6.4.2 Strains isolated from previous studies

Of the 20 strains which had been previously isolated by colleagues, six were positive for Type I PKS domains, and 15 were positive for NRPS domains (Table 3.8). Of these, six Actinobacteria were positive for both of the NP domains (Table 3.8). In the cross-streak antimicrobial assay, three strains showed measurable activity against Gram-positive pathogens, two had activity against Gram-negative pathogens, and two against the yeast C. albicans (Table 3.8). Here, two non-Streptomyces spp., Frondihabitans australicus INR15 and Mesorhizobium qingshengii INR17, displayed measurable activity. Overall, S. spororaveus INR7 was the most exciting Antarctic bacterium in terms of antimicrobial activity, displaying considerable inhibition of Gram-negative pathogens E. coli and P. aeruginosa, as well as S. aureus and C. albicans (Table 3.8).

Table 3.8 Natural product domain amplification and in situ antimicrobial activity for isolates from previous studies.

                                                           PCR        Mean antimicrobial cross-streak activity (mm)
Strain  Closest cultured representative  Phylum            PKS  NRPS  S. aur.  B. subt.  E. coli  P. aerug.  C. albic.
INR13   Azospirillum zeae                α-Proteobacteria  -    +     0        0         0        0          0
INR4    Bacillus aryabhattai             Firmicutes        -    -     0        0         0        0          †
INR6    Burkholderia jiangsuensis        β-Proteobacteria  -    -     0        0         0        0          0
INWT7   Cryobacterium mesophilum         Actinobacteria    -    +     0        0         0        0          0
INR15   Frondihabitans australicus       Actinobacteria    -    +     6        3         †        0          0
SPB151  Kribbella sandramycini           Actinobacteria    -    +     0        0         0        0          0
INR9    Leifsonia shinshuensis           Actinobacteria    +    +     0        0         0        0          0
INR17   Mesorhizobium qingshengii        α-Proteobacteria  -    +     0        1         0        0          0
INWT5   Methylobacterium brachiatum      α-Proteobacteria  -    +     0        0         0        0          0
SPB164  Mycobacterium fluoranthenivorans Actinobacteria    +    +     0        0         0        0          0
SPB16   Paenisporosarcina macmurdoensis  Firmicutes        -    -     0        0         0        0          0
INWT6   Quadrisphaera granulorum         Actinobacteria    -    +     0        0         0        0          0
INWT3   Rhodococcus aerolatus            Actinobacteria    -    -     0        0         0        0          0
SPB1    Rhodococcus yunnanensis          Actinobacteria    -    +     0        0         0        nt         0
SPB167  Streptomyces abikoensis          Actinobacteria    -    -     †        †         0        nt         0
SPB35   Streptomyces beijiangensis       Actinobacteria    +    +     †        †         †        nt         †
SPB162  Streptomyces fildesensis         Actinobacteria    +    +     †        †         0        nt         †
SPB13   Streptomyces indigoferus         Actinobacteria    -    +     0        0         0        nt         0
SPB4    Streptomyces lavendulae          Actinobacteria    +    +     0        †         1        nt         12
INR7    Streptomyces spororaveus         Actinobacteria    +    +     13       †         16       5          15

†: Some inhibition of pathogen growth, nt: not tested

3.6.5 Selection of isolates for whole genome sequencing

Using the selection criteria outlined in Section 3.5.13, eighteen isolates were chosen to undergo WGS. The primary reasons for the selection of each isolate are highlighted in Table 3.9. Three Streptomyces species were prioritised due to the highest measurable antimicrobial activity, in addition to this genus's well-established value in NP discovery. The selected Streptomyces isolates were S. spororaveus INR7, which showed the greatest antimicrobial activity overall, including against Gram-negative pathogens (Table 3.9); S. finlayi NBSH44, which was uniquely cultured through cold-incubated SSMS methods and displayed activity against Gram-positive pathogens; and S. gougerotii NBH77, which inhibited Gram-positive bacteria and the yeast C. albicans (Table 3.9). The S. finlayi and S. gougerotii strains have no high-quality genomes in the GTDB. As Streptomyces are known to be difficult to differentiate by 16S gene sequence similarity alone (Cheng et al. 2016, Labeda et al. 2017), the selected Streptomyces spp. were chosen to be morphologically different and to belong to phylogenetically distinct clades, to avoid sequencing two closely-related strains (Cheng et al. 2016, Labeda et al. 2017). Of the non-Streptomyces species selected, nine were Actinobacteria: Kribbella sandramycini SPB151, Cryobacterium mesophilum INWT7, Frigoribacterium faeni NBH87, Frondihabitans australicus INR15, Geodermatophilus NBWT11, Leifsonia shinshuensis INR9, Pseudarthrobacter sulfonivorans NBSH8, Quadrisphaera granulorum INWT6 and Rhodococcus luteus NBSH90. Five were Alphaproteobacteria: Mesorhizobium sp. nov. NBSH29, Azospirillum zeae INR13, Novosphingobium subterraneum NBM11, Paracoccus carotinifaciens NBH48 and Sphingomonas mucosissima NBWT7; and one was a Bacteroidetes, Hymenobacter sp. nov. NBH84 (Table 3.9). Nine of these genera were of additional biotechnological interest, including Rhodococcus, Sphingomonas, Novosphingobium, Paracoccus, Azospirillum, Geodermatophilus and Pseudarthrobacter, which are known hydrocarbon degraders of interest in bioremediation (Brooijmans et al. 2009).

Table 3.9 Characteristics used to select eighteen strains for whole genome sequencing.

Strain  Closest cultured representative  Site  Phylum            ID (%)  GTDB  AB activity  PKS/NRPS  Other
INR13   Azospirillum zeae                RR    α-Proteobacteria  100     -     -            -/+       +
INWT7   Cryobacterium mesophilum         WT    Actinobacteria    99      -     -            -/+
NBH87   Frigoribacterium faeni           HI    Actinobacteria    99      -     -            -/+
INR15   Frondihabitans australicus       RR    Actinobacteria    99      -     ++           -/+
NBWT11  Geodermatophilus soli            WT    Actinobacteria    98      -     -            -/-       +
NBH84   Hymenobacter xinjiangensis       HI    Bacteroidetes     97      -     -            -/-       +
SPB151  Kribbella sandramycini           BP    Actinobacteria    99      -     -            -/+
INR9    Leifsonia shinshuensis           RR    Actinobacteria    99      -     -            +/+
NBSH29  Mesorhizobium olivaresii         HI    α-Proteobacteria  98      -     -            -/-       +
NBM11   Novosphingobium subterraneum     MP    α-Proteobacteria  99      +     †            -/-       +
NBH48   Paracoccus carotinifaciens       HI    α-Proteobacteria  99      -     †            -/-       +
NBSH8   Pseudarthrobacter sulfonivorans  HI    Actinobacteria    100     +     †            -/-       +
INWT6   Quadrisphaera granulorum         WT    Actinobacteria    99      -     -            -/+
NBSH90  Rhodococcus luteus               HI    Actinobacteria    100     +     -            +/+       +
NBWT7   Sphingomonas mucosissima         WT    α-Proteobacteria  99      -     †            -/-       +
NBSH44  Streptomyces finlayi             HI    Actinobacteria    100     -     ++           +/+
NBH77   Streptomyces gougerotii          HI    Actinobacteria    99      -     ++           +/-
INR7    Streptomyces spororaveus         RR    Actinobacteria    99      +     ++++         +/+

Primary reasons for selection for each isolate are highlighted in red. †: Some inhibition of pathogens.

3.7 DISCUSSION

Antarctic desert soil Actinomycetales and Myxococcales were targeted here using two novel culturing techniques, DSC and SSMS. While no Myxococcales were recovered, the methods were successful in capturing a total library of 47 Antarctic bacterial species (Tables 3.4 & 3.5), spanning 19 genera across four phyla, and including 32 different Actinomycetales species. The goal to target Myxococcales was optimistic, as they are known to be predominantly mesophilic, with only four psychrophilic Myxococcales previously recorded (Ruckert 1985, Shimkets et al. 2006, Brockman & Boyd 1963, Dawid et al. 1988), and molecular studies indicated low relative abundance of Deltaproteobacteria in these soils (< 0.7%). Nevertheless, DSC was an effective, unconventional culturing technique for the capture of other Antarctic genera, particularly Streptomyces. This was most evident for HI, which exhibited high 16S rDNA gene relative abundance of Actinobacteria (67%) compared with the other DSC soils (< 33%) (Fig. 3.3). Sporulating Streptomyces microcolonies were visualised atop soil crumbs and picked by stereomicroscopy (Figs. 3.9, 3.10). To my knowledge, this is the first report of DSC for recovery of Streptomyces. Fungal overgrowth was problematic during DSC, with several filamentous fungi unaffected by the antifungal cycloheximide. For future DSC, a combination of antifungals, such as cycloheximide plus nystatin, would be advisable (Karwowski et al. 1996).

The soil substrate membrane system (SSMS) was employed here for the first time under psychrophilic incubation conditions. Lower diversity of bacteria was obtained from SSMS in comparison to DSC for the same HI soil (Tables 3.4, 3.5). However, the majority of species isolated from SSMS were not recovered by DSC (Fig. 3.14, Tables 3.4, 3.5). The exceptions, Pseudarthrobacter and Rhodococcus, were highly abundant in SSMS and were well-adapted to growth at both ≤ 8°C and 21°C. All isolates retrieved by cold-incubated SSMS were capable of growth at RT. Previous studies have similarly noted a tendency toward psychrotrophy rather than psychrophily in terrestrial Antarctic microorganisms, which has been attributed to their need to endure regular freeze-thaw cycles (Morita 1975, De Maayer et al. 2014, Soina et al. 2004). Temperatures up to +18°C have been recorded in southern Antarctic surface soils, with large daily fluctuations (~10°C) during summer, correlated with the proportion of incoming solar radiation (Aislabie et al. 2004, Balks et al. 2002).

Extended incubation times assist in recovery of rare, oligotrophic taxa, especially those from nutrient-poor environments, and soils where the communities may be largely dormant, such as in Antarctica (Pulschen et al. 2017, Alain & Querellou 2009, Davis et al. 2005). Here, extended culture times (> 100 d) led to the recovery of several strains likely to be novel, for example Hymenobacter NBH84 (Table 3.4), Geodermatophilus NBWT11 and Mesorhizobium NBSH29 (97-98% identity to known species) (Tables 3.4, 3.5). Additionally, the morphologically distinct Streptomyces, S. gougerotii and S. lienomycini (Fig. 3.11), were also recovered after extended incubation times (Table 3.4). The expression of pigments in bacteria commonly coincides with nutrient deprivation (Couso et al. 2012, Liu et al. 2013); thus, lengthy culturing times may have assisted in visible differentiation of some of these isolates. Other slow-growers included the rarely-cultured Actinomycetales genera Frigoribacterium and Janibacter (Tiwari & Gupta 2013) (Table 3.4). Only four species of Frigoribacterium have been previously reported, with the first being a psychrotroph, isolated from airborne dust (Kampfer et al. 2000, Kong 2016). Members of the Janibacter genus are also rare, with only 10 species described thus far (Maaloum et al. 2019).

622

Carotenoid-like pigmentation has been reported to be widespread in cold-adapted microorganisms, and this was similarly found here, with 23 yellow to red pigmented strains (Fig. 3.16) (Baraúna et al., 2017; De Maayer et al., 2014; Koblížek & Brussaard, 2015; Peeters et al., 2011). Carotenoids are most commonly associated with protection from UV radiation via the scavenging of reactive oxygen species such as singlet oxygen (Walter & Strack 2011, Maresca et al. 2008), but they are also hypothesised to assist with homeoviscous adaptation, playing a regulatory role in membrane fluidity (Chattopadhyay & Jagannadham 2001, Walter & Strack 2011). Furthermore, carotenoids function as accessory light-harvesting pigments in aerobic anoxygenic phototrophs (AAP), assisting in bacteriochlorophyll-mediated photosynthesis (Tahon & Willems 2017, Imhoff et al. 2018, Koblížek & Brussaard 2015). AAP comprise certain members of the Alpha-, Beta- and Gammaproteobacteria, and include a number of genera which are commonly recovered from polar soils and which were also isolated in this chapter, namely Methylobacterium and Sphingomonas (Makhalanyane et al. 2015a, Tahon & Willems 2017, Walter & Strack 2011, Imhoff et al. 2018). Phototrophy may thus be an important survival strategy for AAP Proteobacterial members in Antarctic desert soils.

The contaminated site, Wilkes Tip, was the only soil not to share species with other samples (Fig. 3.15). Of only four isolates recovered from Wilkes Tip, two were Geodermatophilus species, one of which is likely to be novel (Table 3.4). Members of the family Geodermatophilaceae are predominantly associated with soil and rock surfaces in desert and polar regions (Normand 2006). Geodermatophilus have a known tolerance for harsh environmental conditions such as UV and ionizing radiation, desiccation and high salinity and, of particular interest, petroleum and heavy metal contamination (Sghaier et al. 2015, Wang et al. 2017, Montero-Calasanz et al. 2013). It is likely that soil microbial diversity has been affected by contamination at the WT site (Fig. 3.2D). Significant reductions in species richness and diversity have been previously reported in soils of increasing fuel contamination, along with a corresponding enrichment of hydrocarbon-degrading taxa (van Dorst et al. 2016, Aislabie et al. 2004). All of the species isolated from WT in this study are known hydrocarbon degraders, and therefore may prove useful for bioremediation in low temperature environments (Haritash & Kaushik 2009, Andreoni et al. 2004, Hassanshahian et al. 2012, Brooijmans et al. 2009).

Here, the greatest antimicrobial potential was found in Streptomyces spp., which displayed the strongest inhibition of pathogen growth (Tables 3.7 & 3.8). Two other genera, Frondihabitans and Mesorhizobium, were of interest as they produced measurable activity against Gram-positive pathogens (Table 3.8). Previously, a marine Mesorhizobium sp. has been reported to produce a homoserine lactone compound with antibacterial activity against B. subtilis, in addition to cytotoxic activity against tumour cell lines (Krick et al. 2007). Antimicrobial activity has not been previously described for Frondihabitans.

To conclude, the culturing outcomes from this chapter have led to the selection of eighteen isolates, including three Streptomyces spp., for whole genome sequencing in Chapter 4. The chosen isolates spanned 16 genera and originated from four pristine Antarctic desert soils, Herring Island (HI), Mitchell Peninsula (MP), Robinson Ridge (RR) and Browning Peninsula (BP), as well as the contaminated site, Wilkes Tip (WT) (Table 3.9). Of the Streptomyces isolates, the most exciting strain was S. spororaveus INR7, which displayed activity against the Gram-negative pathogens E. coli and P. aeruginosa, which are of particular concern in terms of AMR (WHO, 2014) (Section 1.1).

CHAPTER FOUR

4 ANTARCTIC BACTERIAL GENOMES HARBOUR A WEALTH OF UNCHARACTERISED BIOSYNTHETIC GENE CLUSTERS

4.1 INTRODUCTION

More than twenty years have passed since the first bacterial genome was completely sequenced (Fleischmann et al. 1995). In that time, rapid advances in HTS and bioinformatics technologies have led to spectacular growth in the number of available genome assemblies (Schmid et al. 2018, Levy & Myers 2016). Presently, over 190,000 bacterial genomes reside in the NCBI Genome database. Around 5,600 of these have been awarded reference or representative genome status, curated to indicate high-quality assemblies (NCBI, 2019). However, the majority of publicly available genomes remain in a draft state of varying contiguity, completion and correctness (Schmid et al. 2018, Koren et al. 2013, Studholme 2016). This has implications for the accuracy of downstream analyses and our understanding of microbial processes. For example, Daniel-Ivad et al. (2017) recently reported the complete sequence of a cryptic BGC, despite the source genome assembly (Streptomyces GCA_001974775) containing an estimated 82% contamination with DNA from a different microorganism (http://gtdb.ecogenomic.org/genomes?gid=GCA_001974775.1).

In this chapter we aimed to achieve high-quality genome assemblies for the 18 bacterial isolates selected for genome sequencing in Chapter 3 (Table 3.9). Thus far, very few complete genomes have been reported for bacterial isolates from eastern Antarctica. They include a Firmicutes genus, Carnobacterium, recovered from seawater (Zhu et al. 2016); a Proteobacterial genus, Glaciecola, isolated from sea ice (Qin 2014, Bowman et al. 1998); and one terrestrial Actinobacterial genus, Nesterenkonia, from the McMurdo Dry Valleys (Aliyu et al. 2016). The primary goal of this chapter was the characterisation of the BGCs harboured by each isolate, including evaluation of BGC novelty and, where possible, prediction of encoded compounds. We selected the long-read platform PacBio RS II in order to optimise capture of BGCs, which usually span long, repetitive, high G+C regions that are difficult to resolve with SGS platforms (Section 1.8) (Nakano et al. 2017, Gomez-Escribano et al. 2016). It was hypothesised that the resolution provided by long reads would allow for the sequencing of multiple genomes from one sequencing library. We proposed that differences in isolates at genus level would be distinct enough to allow for adequate separation of species during genome assembly.

4.2 MATERIALS AND METHODS

4.2.1 High molecular weight genomic DNA extractions

The PacBio RS II sequencing platform requires a large input quantity of high-quality, high molecular weight genomic DNA (Ramaciotti Centre for Genomics 2015). Thus, steps were taken throughout extraction procedures to minimise DNA damage and fragmentation. Vortex mixing, heating, and freeze/thaw cycles were avoided, and large bore pipette tips were employed throughout. DNA extraction methods varied slightly depending on the genus of the isolate (Table 4.1).

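Table 4.1 Genomic DNA extraction methods for Antarctic bacteria.

Isolate                       DNA extraction method       Reference
Streptomyces NBSH44           Kirby method                Kieser et al. 2000
Streptomyces INR7             Kirby method                Kieser et al. 2000
Streptomyces NBH77            Kirby method                Kieser et al. 2000
Kribbella SPB151              Kirby method                Kieser et al. 2000
Mesorhizobium NBSH29          Phenol-chloroform method    Rusch et al. 2007; Yau et al. 2013
Hymenobacter NBH84            Phenol-chloroform method    Rusch et al. 2007; Yau et al. 2013
Geodermatophilaceae NBWT11    Phenol-chloroform method    Rusch et al. 2007; Yau et al. 2013
Sphingomonas NBWT7            Phenol-chloroform method    Rusch et al. 2007; Yau et al. 2013
Quadrisphaera INWT6           Phenol-chloroform method    Rusch et al. 2007; Yau et al. 2013
Pseudarthrobacter NBSH8       Phenol-chloroform method    Rusch et al. 2007; Yau et al. 2013
Frigoribacterium NBH87        Phenol-chloroform method    Rusch et al. 2007; Yau et al. 2013
Cryobacterium INWT7           Phenol-chloroform method    Rusch et al. 2007; Yau et al. 2013
Rhodococcus NBSH90            Phenol-chloroform method    Rusch et al. 2007; Yau et al. 2013
Frondihabitans INR15 *        Phenol-chloroform method    Rusch et al. 2007; Yau et al. 2013
Paracoccus NBH48 *            Phenol-chloroform method    Rusch et al. 2007; Yau et al. 2013
Leifsonia INR9 *              Phenol-chloroform method    Rusch et al. 2007; Yau et al. 2013
Novosphingobium NBM11 *       Phenol-chloroform method    Rusch et al. 2007; Yau et al. 2013
Azospirillum INR13 *          Phenol-chloroform method    Rusch et al. 2007; Yau et al. 2013

* Isolates that required an additional chloroform extraction.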

4.2.1.1 Spore harvesting for Streptomyces and Kribbella isolates

Triplicate spore stocks were harvested for Streptomyces strains NBSH44, INR7 and NBH77, and the Kribbella isolate SPB151, using methods described by Kieser et al. (2000) and Shephard et al. (2010). From pure, sporulating 7-day old cultures grown on ISP4 agar (Appendix A3.1), spores from a single colony were extracted using a 1 µL sterile loop and spread in a cross-hatch format to cover an entire fresh ISP4 agar plate. Plates were incubated at RT until a sporulating lawn was well-developed (~14 days). Sterile water (3 mL) was added to the lawn and the hydrophobic spores were dislodged and suspended using a sterile spreader. Once suspended, water and spores were aseptically transferred to a 50 mL screw top Falcon tube (Corning) and water was added up to 30 mL. Spore chains were disrupted by vigorous vortexing until a homogeneous mixture was obtained (~10 min). To remove debris, samples were filtered through a 10 mL syringe plugged with sterile cotton wool. Spore suspensions were centrifuged at 2,000 x g for 10 min, supernatants were immediately removed, and spores were re-suspended in 20% glycerol (1 mL) and stored at -80°C until further use.

4.2.1.2 Modified Kirby method for Streptomyces and Kribbella genomic DNA extraction

For Streptomyces spp., and the morphologically similar Kribbella isolate, an adapted Kirby mix method, reported to retrieve genomic DNA ~40 kb in length (Kieser et al. 2000), was used for genomic DNA extraction (Table 4.1). Triplicate 25 mL cultures, comprising nutrient broth (Oxoid) inoculated with 100 µL spore stock, were incubated at RT in 250 mL baffled Erlenmeyer flasks (Corning, Victoria, Australia) using an orbital shaker (Ratek, Victoria, Australia) (200 rpm), and harvested at late-exponential phase. To confirm purity of cultures at the time of extraction, subcultures were prepared onto NA plates and examined after two days of incubation at RT. Harvested cultures were centrifuged in 50 mL Falcon tubes at 500 x g for 10 min. Supernatants were removed and the mycelium was washed with 10% sucrose solution. Centrifugation was repeated, supernatants were removed, and the wet mycelium was re-suspended in 3 mL TE25S buffer (25 mM Tris-HCl pH 8, 25 mM EDTA pH 8, 0.3 M sucrose).

For DNA extraction, a 100 µL aliquot of lysozyme (60 mg/mL) (Sigma-Aldrich) was added to enable digestion of the Gram-positive cell walls, and tubes were incubated at 37°C for 10 min. This was followed by the addition of 4 mL of 2 x Kirby Mix (2 g sodium dodecyl sulfate (SDS), 12 g sodium 4-aminosalicylate, 5 mL 2 M Tris-HCl pH 8, 6 mL buffered phenol pH 8, made up to 100 mL with sterile water) with gentle agitation at RT for 3 min, and then by the addition of 8 mL of phenol:chloroform:isoamyl alcohol mixture (PCI) (25:24:1) at pH 8 (Sigma-Aldrich), with gentle agitation for 15 s. Emulsions were centrifuged for 10 min at 1,500 x g to allow separation of the organic and aqueous phases. The upper aqueous phase, which contained the polar nucleic acids, was transferred to a clean 50 mL Falcon tube. The extraction was repeated with the addition of 3 mL of PCI and 600 µL of 3 M unbuffered sodium acetate (Sigma-Aldrich), with gentle agitation for 15 s. Phases were separated by centrifugation for 10 min at 1,500 x g, and the aqueous phase was transferred to a clean 50 mL Falcon tube. DNA was precipitated by addition of 0.6 volume of isopropanol (Ajax FineChem), followed by centrifugation at 6,842 x g for 30 min. Supernatants were discarded, and the DNA pellet was re-suspended in minimal isopropanol and transferred to 2 mL microcentrifuge tubes for centrifugation at 18,506 x g for 10 min. Supernatants were removed and the DNA pellet was washed with 600 µL of 70% ethanol; centrifugation was repeated, supernatants were removed, and the pellet was allowed to dry at RT. To hydrolyse RNA, pellets were re-dissolved in 50 µL of Tris-EDTA (TE buffer) at pH 8 (Sigma-Aldrich), and 2 µL of pre-boiled RNase A solution (4 mg/mL) (Life Technologies, Australia) was added for incubation at 37°C for 10 min. DNA was re-extracted from the sample with an equal volume of PCI and centrifuged at 18,506 x g for 5 min. The aqueous phase was mixed with 1/10 volume of 3 M sodium acetate pH 8 and 0.6 volume of isopropanol to precipitate the DNA, and centrifuged at 18,506 x g for 5 min. DNA pellets were washed with 100 µL of 70% ethanol, re-centrifuged, dried at RT, re-suspended in 100 µL elution buffer (EB) (Qiagen) and stored at 4°C.

4.2.1.3 Phenol-chloroform genomic DNA extraction for other genera

For all other genera (Table 4.1), a modified phenol-chloroform DNA extraction method was used (Rusch et al. 2007, Yau et al. 2013). Nutrient broth (25 mL) was prepared in 250 mL baffled flasks and inoculated with single colonies for incubation overnight at RT. Triplicate 250 mL Erlenmeyer flasks with 25 mL nutrient broth were inoculated with overnight cultures to an OD600 of 0.005. Cultures were incubated at RT using an orbital shaker (200 rpm), and cells were harvested at late-exponential phase. Purity of cultures at the time of extraction was confirmed as previously described (Section 4.2.1.2). Harvested cultures were centrifuged at 500 x g for 10 min, supernatants were discarded, and pellets were resuspended in 10 mL sterile water. To digest proteins, a 1/20 volume of TE buffer (pH 8) and 100 µL Proteinase K (20 mg/mL, Sigma-Aldrich) were added to samples with inversion, followed by 1 mL of 10% SDS to enable cell lysis. Samples were incubated at 55°C for 2 h in a water bath with shaking at 175 rpm. To separate nucleic acids from protein and lipid cellular components, an equal volume of buffered phenol (pH 8, Sigma-Aldrich) was added with inversion to mix. Emulsions were centrifuged at 1,622 x g for 15 min at 25°C to separate the phases, and the top aqueous layer, which contained the DNA, was retained.

At this point, several DNA extracts (isolates NBM11, INR15, NBH48, INR13 and INR9) proved difficult to separate from the organic phase (Table 4.1), possibly due to higher concentrations of hydrophobic polymers such as lipids, carbohydrates or excess proteins. To allow further separation of the phases, these extracts were subjected to an additional chloroform extraction (Psifidi et al. 2015), taking advantage of chloroform's higher density. An equal volume of chloroform (AnalaR, Merck) was added with gentle mixing. Samples were centrifuged at 1,622 x g for 15 min, and the aqueous layer was retained. For all DNA extracts, an equal volume of isopropanol was then added and mixed by inversion. DNA was precipitated overnight at 4°C, then centrifuged at 6,842 x g for 30 min at 20°C, and the isopropanol was removed. The resulting pellet was re-suspended in minimal isopropanol and transferred to a 2 mL microcentrifuge tube, centrifuged at 18,506 x g at RT for 10 min, and the isopropanol was discarded. The resulting pellet was air dried at RT, then re-suspended in 50 µL TE buffer pH 8 and incubated at 4°C for 1 h.

To remove RNA from DNA extracts, an aliquot (2 µL) of RNase A (4 mg/mL) was added and incubated at 37°C for 10 min. The volume was then increased to 700 µL using TE buffer pH 8. DNA extraction was repeated by addition of an equal volume of PCI, and the phases were mixed by inversion. Phases were separated by centrifugation at 18,506 x g at RT for 5 min. The upper aqueous phase was retained, and DNA was precipitated using a 1/10 volume of 3 M sodium acetate (pH 8) and an equal volume of isopropanol. DNA was pelleted by centrifugation at 18,506 x g at RT for 30 min, and supernatants were removed. The pellet was washed with 100 µL 70% ethanol and re-centrifuged for 10 min. Ethanol was removed and the DNA pellet was air dried at RT, followed by resuspension in 100 µL EB. Purified DNA was stored at 4°C until quantified and pooled for sequence library preparation.

4.2.1.4 Quantification and quality assessment of genomic DNA

Genomic DNA was quantified using the Quant-iT Picogreen dsDNA Assay kit (Life Technologies). Following quantification, the Rhodococcus isolate NBSH90 was excluded from further sequencing preparation, due to insufficient DNA retrieval resulting from poor growth in liquid culture. For all other isolate DNA extractions, the presence of majority high-molecular weight genomic DNA > 40 kb was verified via 1% agarose gel electrophoresis (Section 3.2.6).

4.2.2 Multi-genome DNA library preparation and sequencing

In preparation for sequencing, three multi-genome libraries (A1, A2 and A3) were created from pooled isolate DNA (Table 4.2). For each library, bacteria to be combined were different at genus level based on prior 16S rDNA gene sequencing (Section 3.2.6, Table 3.9). Final DNA submission conformed to PacBio RS II sequencing guidelines of > 20 µg DNA per pooled library. Libraries A1 and A2 each comprised DNA from six isolates, while library A3 contained DNA from five isolates, totalling 17 individual isolates (Table 4.2). To normalise coverage for each individual genome during genome sequencing, an equimolar ratio of DNA was calculated, with correction based on the estimated genome size for each genus.

Multi-genome libraries were submitted to The Ramaciotti Centre for Gene Function Analysis, at UNSW Sydney (NSW, Australia), for Agencourt AMPure XP (Beckman Coulter, Brea, CA, USA) magnetic bead clean up, SMRT-bell library preparation with 10-20 kb Blue Pippin size selection, and PacBio RS II sequencing with P6/C4 chemistry, employing three SMRT cells per library.

Table 4.2. Distribution of isolates within all three multi-genome DNA libraries.

Library  No.  Isolate                       DNA conc. (ng/μL)  Est. genome size (Mb)  Library input vol. (μL)
A1       1    Streptomyces NBSH44           67.4               8                      99
A1       2    Mesorhizobium NBSH29          22.0               7                      265
A1       3    Hymenobacter NBH84            44.4               5                      94
A1       4    Frondihabitans INR15          102.5              5                      41
A1       5    Paracoccus NBH48              75.5               4                      44
A1       6    Leifsonia INR9                140.5              3                      18
A2       1    Streptomyces INR7             95.4               8                      70
A2       2    Kribbella SPB151              51.9               7                      112
A2       3    Geodermatophilaceae NBWT11    134.6              5                      31
A2       4    Sphingomonas NBWT7            83.0               5                      50
A2       5    Quadrisphaera INWT6           31.4               4                      106
A2       6    Novosphingobium NBM11         24.4               3                      103
A3       1    Streptomyces NBH77            60.0               8                      133
A3       2    Azospirillum INR13            35.6               7                      197
A3       3    Pseudarthrobacter NBSH8       41.3               5                      121
A3       4    Frigoribacterium NBH87        50.5               4                      79
A3       5    Cryobacterium INWT7           50.1               4                      80
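The genome-size-corrected pooling described in Section 4.2.2 can be illustrated with a short calculation. The sketch below is not the procedure recorded in the thesis; it assumes that "equimolar" means equal numbers of genome copies per isolate, so that the DNA mass contributed by each isolate is proportional to its estimated genome size. The function name `pooling_volumes` and the total pooled mass of about 26.7 µg are assumptions for illustration; that total is not stated in the text and was chosen only because it reproduces the library A1 input volumes in Table 4.2.

```python
def pooling_volumes(isolates, total_mass_ng):
    """Return the volume (µL) of each extract to pool.

    Each isolate contributes DNA mass in proportion to its estimated
    genome size, which gives roughly equal genome copy numbers.
    `isolates` is a list of (name, concentration in ng/µL, genome size in Mb).
    """
    total_size_mb = sum(size_mb for _, _, size_mb in isolates)
    volumes = {}
    for name, conc_ng_per_ul, size_mb in isolates:
        mass_ng = total_mass_ng * size_mb / total_size_mb
        volumes[name] = mass_ng / conc_ng_per_ul
    return volumes


# Concentrations and estimated genome sizes for library A1 (Table 4.2).
library_a1 = [
    ("Streptomyces NBSH44", 67.4, 8),
    ("Mesorhizobium NBSH29", 22.0, 7),
    ("Hymenobacter NBH84", 44.4, 5),
    ("Frondihabitans INR15", 102.5, 5),
    ("Paracoccus NBH48", 75.5, 4),
    ("Leifsonia INR9", 140.5, 3),
]

# Assumed total pooled mass (ng); hypothetical, not reported in the text.
ASSUMED_TOTAL_NG = 26_667

volumes = pooling_volumes(library_a1, ASSUMED_TOTAL_NG)
for name, volume_ul in volumes.items():
    print(f"{name}: {volume_ul:.0f} µL")
```

Under these assumptions the rounded volumes agree with the "Library input vol." column for A1, which suggests, but does not confirm, that a mass-proportional-to-genome-size rule was applied.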

4.2.3 De novo genome assembly from multi-genome libraries

For the three multi-genome libraries (A1, A2 and A3), de novo assemblies were performed using FALCON v1.8.6 (Chin et al. 2013). Preassembly seed read cutoffs, corresponding to 30X coverage and based on an approximate combined genome size of 30 Mb per library, were predicted using in-house software, SMRTSCAPE (SMRT Subread Coverage & Assembly Parameter Estimator; http://rest.slimsuite.unsw.edu.au/smrtscape). Preassembly length cutoffs for seed reads were 18,289 bp (A1), 14,431 bp (A2) and 14,662 bp (A3), followed by a cutoff of 7,000 bp for each assembly. Circularisation and joining were performed with Circlator v1.4.0 (Hunt et al. 2015), with dependencies prodigal v2.6.3 (Hyatt et al. 2010), samtools v1.7 (Li et al. 2009), spades v3.7.0 (Nurk et al. 2013), bwa v0.7.17 (Li 2013), mummer v3.23 (Kurtz et al. 2004) and canu v1.7 (Koren et al. 2017). Assemblies were subjected to two rounds of consensus polishing, using the GenomicConsensus package tool variantCaller v2.2.1, applying the Arrow algorithm (Chin et al. 2013, PacBio, 2019).
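As an illustration of how a seed read length cutoff can be tied to a coverage target, the following sketch picks the shortest read length such that all reads of at least that length together supply the requested depth. This is a generic approach and is not taken from SMRTSCAPE, whose internal method is not described here; the function name `seed_length_cutoff` and the toy read lengths are hypothetical.

```python
def seed_length_cutoff(read_lengths, genome_size_bp, target_coverage=30):
    """Return the length of the shortest read needed so that the longest
    reads together reach target_coverage x genome_size_bp bases.

    Returns None if the full read set cannot reach the target.
    """
    target_bases = target_coverage * genome_size_bp
    running_total = 0
    # Accumulate bases from the longest read downwards.
    for length in sorted(read_lengths, reverse=True):
        running_total += length
        if running_total >= target_bases:
            return length
    return None


# Toy example: six reads, a 1 bp "genome" and a 30X target.
toy_reads = [10, 9, 8, 7, 6, 5]
toy_cutoff = seed_length_cutoff(toy_reads, genome_size_bp=1, target_coverage=30)
print(toy_cutoff)

# For a 30 Mb combined genome size at 30X, the seed reads must total:
print(30 * 30_000_000, "bases")
```

In the toy example the three longest reads total 27 bases, so the fourth read is required and its length becomes the cutoff.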

4.2.3.1 Genome annotation, functional prediction and assessment of genome quality

PacBio subreads were aligned to assemblies using pbalign v0.3.1 (https://github.com/PacificBiosciences/pbalign) and the resulting contigs were assigned to individual species using PAGSAT v2.5.1 (Edwards & Palopoli 2015, Edwards et al. 2018), with reference to representative genomes downloaded from the NCBI RefSeq database (Appendix Table A3.1). Binned contigs were annotated using Prokka v1.13 (Seemann 2014), with dependencies hmmer v3.1b2 (Eddy 2011), prodigal v2.6.3, tbl2asn v25.6 (https://www.ncbi.nlm.nih.gov/genbank/tbl2asn2/), rnammer v1.2 (Lagesen et al. 2007), parallel v20180622 (Tange 2018) and BLAST+ v2.7.1 (Camacho et al. 2009). RJE_GFF v0.1.0 was employed to investigate incorrectly split annotations (R.J. Edwards, pers. comm.). Prokka-annotated proteins were then aligned to those of UniProt using MULTIHAQ v1.4.1 (Edwards et al. 2007), with dependencies BLAST+ v2.7.1, R v3.5.1, mafft v7.310 (Katoh & Standley 2013), clustalw v2.1 (Larkin et al. 2007) and clustalo v1.2.2 (Sievers et al. 2011). Annotated protein sequences were phylogenetically classified into clusters of orthologous groups (COGs), assigned via the COG functional annotator module of WebMGA (http://weizhong-lab.ucsd.edu/webMGA/) (Niu et al. 2011, Tatusov 2000), with an E-value cut-off of 0.001. Relative abundances of COG categories for each Antarctic genome were visualised as a heatmap with R 3.4.0 using the ggplot2 v2.2.1 package. Genome assemblies were quality assessed for completeness and contamination using the CheckM v1.0.7 command lineage_wf (Parks et al. 2015), with the dependencies prodigal v2.6.3, hmmer v3.1b2 and pplacer v1.1.alpha16 (Matsen et al. 2010).

4.2.3.2 Phylogenetic analysis of genome-retrieved 16S rDNA genes

Antarctic bacterial 16S rDNA gene sequences retrieved from respective genomes were compared against the NCBI nucleotide database using BLASTn (https://blast.ncbi.nlm.nih.gov/Blast.cgi) (Altschul et al. 1990). The sequences of the five most closely related species were exported for construction of phylogenetic trees using the phylogeny.fr tool (http://www.phylogeny.fr/alacarte.cgi) (Dereeper et al. 2008). Sequences were aligned with MUSCLE 3.8.31 (Edgar 2004) in full processing mode and curated with GBlocks 0.91b (Castresana 2000) using default settings. Phylogeny was inferred by the maximum-likelihood method with PHYML 3.0 (Guindon et al. 2010), using 100 bootstrap iterations. Resulting newick files were imported to iTOL 4.3.3 (https://itol.embl.de/) (Letunic & Bork 2016) for tree visualisation. The Streptomyces, Hymenobacter, Proteobacterial and non-Streptomyces Actinobacterial clades were visualised separately, via truncation at the corresponding clade branch. Bootstrap values > 50% were displayed.

4.2.4 Secondary metabolite gene cluster analysis

4.2.4.1 AntiSMASH analysis for all Antarctic genomes

Un-annotated nucleotide sequences for all Antarctic genome contigs were analysed for BGCs using antiSMASH v5.0.0 (antibiotics and secondary metabolite analysis shell; https://antismash.secondarymetabolites.org/) (Medema et al. 2011, Blin et al. 2019), with all optional features enabled, including ClusterBLAST, SubclusterBLAST, KnownClusterBLAST (Medema et al. 2015, Medema et al. 2011), Pfam analysis (Finn et al. 2016) and active site finder (Weber et al. 2015). Resulting clusters were verified manually through visual inspection. This included examination of each BGC for completeness and contiguity, and examination of individual genes and domains within each BGC for order and similarity to known sequences. AntiSMASH cluster results were compiled as a table in Appendix Table A3.2, and a summary of BGC categories found in each genome was visualised as a heatmap in R 3.4.0 using the ggplot2 v2.2.1 package. For Streptomyces and Kribbella isolates, BGCs and their corresponding similarity to known clusters were mapped to contigs as circular plots, generated using Circa 1.2.1 (http://omgenomics.com/circa/). Predicted chemical structures were created using ChemDraw Prime 15.0 (PerkinElmer, Victoria, Australia).

4.2.4.2 BLASTp and NaPDoS analysis of detected BGC domain sequences

Further analysis was conducted on the five genomes found to contain Type I PKS, NRPS, and Type II PKS clusters, namely the Streptomyces spp. NBSH44, INR7 and NBH77, Kribbella SPB151 and Azospirillum INR13. Amino acid sequences for PKS and NRPS genes were analysed by BLASTp (https://blast.ncbi.nlm.nih.gov/Blast.cgi) against the entire NCBI reference database with default settings (Warren & David 1993). Additionally, amino acid sequences corresponding to PKS ketosynthase domains (KS), and NRPS condensation domains (C) were compiled and analysed using NaPDoS (http://napdos.ucsd.edu/napdos_home.html) (Ziemert et al. 2012), retrieving three matches per sequence. Phylogenetic trees were constructed for KS and C domains in NaPDoS, using the maximum likelihood method, employing MUSCLE alignment of sequences alongside BLAST matches against the curated NaPDoS database, and FastTree (Guindon & Gascuel 2003) for phylogenetic analysis. The resulting newick files for both KS and C domain analyses were imported into iTOL 4.3.3 for tree visualisation (Letunic & Bork 2016). Trees were pruned to remove multiple trimmings of identical domain sequence regions produced by NaPDoS. Leaves were coloured to indicate the primary bioactivities of the compounds produced by the corresponding homologous pathways. Complete NaPDoS results were compiled alongside BLASTp results in Appendix Tables A3.3 to A3.6.

4.3 RESULTS

4.3.1 Sequencing output and assembly of multi-genome libraries

PacBio sequencing for the three multi-genome libraries (A1, A2 and A3) yielded 174,282, 132,856 and 138,043 subreads per library, with N50 values of 17,552 bp, 16,476 bp and 16,755 bp respectively (Table 4.3). Restricting data to the longest subread per ZMW yielded 127,482 subreads (1.80 Gb) for A1; 93,547 (1.27 Gb) for A2; and 96,491 (1.30 Gb) for A3 (Table 4.3). FALCON assembly of unique reads for A1, A2 and A3 resulted in 55, 39 and 95 contigs respectively, with total lengths of 32.7 Mb, 34.5 Mb and 23.8 Mb per library, and N50 values of 4.4 Mb (A1), 4.1 Mb (A2) and 3.3 Mb (A3) (Table 4.3).

Table 4.3 Sequencing output and assembly summaries for three multi-genome libraries.

Library  Sequences            Combined subreads  Unique sequences  FALCON assembled
A1       Total number         174,282            127,482           55
A1       Total length (bp)    2,070,377,334      1,803,951,037     32,727,291
A1       Min. length (bp)     35                 36                19,929
A1       Max. length (bp)     46,582             46,582            6,659,843
A1       Mean length (bp)     11,879.47          14,150.63         595,041.65
A1       Median length (bp)   11,065             14,843            63,488
A1       N50 length (bp)      17,552             18,401            4,432,358

A2       Total number         132,856            93,547            39
A2       Total length (bp)    1,522,567,248      1,267,874,915     34,533,235
A2       Min. length (bp)     35                 37                17,238
A2       Max. length (bp)     45,106             45,106            8,306,415
A2       Mean length (bp)     11,460.28          13,553.35         885,467.56
A2       Median length (bp)   11,070             14,236            33,683
A2       N50 length (bp)      16,476             17,548            4,142,770

A3       Total number         138,043            96,491            95
A3       Total length (bp)    1,554,898,583      1,297,410,554     23,820,564
A3       Min. length (bp)     35                 38                14,223
A3       Max. length (bp)     42,612             42,612            6,196,405
A3       Mean length (bp)     11,263.87          13,445.92         250,742.78
A3       Median length (bp)   10,725             14,028            39,656
A3       N50 length (bp)      16,755             17,769            3,319,087
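The N50 values in Table 4.3, and the L50 values used for individual genomes later in this chapter, follow the standard definitions: N50 is the length of the sequence at which the cumulative length of sequences, sorted from longest to shortest, first reaches half of the total, and L50 is the number of sequences needed to get there. A minimal sketch is given below; the function name `n50_l50` and the toy contig lengths are illustrative only and are not drawn from the assemblies.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running_total = 0
    for count, length in enumerate(ordered, start=1):
        running_total += length
        if running_total >= half_total:
            return length, count
    raise ValueError("no sequence lengths supplied")


# Toy contig set: total 18, so half is 9; the two longest contigs (8 + 4) reach it.
toy_contigs = [8, 4, 3, 2, 1]
toy_n50, toy_l50 = n50_l50(toy_contigs)
print(toy_n50, toy_l50)
```

A single-contig assembly gives L50 = 1 with N50 equal to the contig length, which matches the pattern seen for the closed genomes in Table 4.4.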

4.3.2 Individual genome assemblies, annotation and quality assessment

Of the 17 bacterial genomes sequenced, eight returned very high-quality assemblies, displaying high contiguity (L50=1; N50=3.2-8.3 Mb), uniform coverage (> 40x), and completeness (> 99%) (Table 4.4). These were the Streptomyces spp. NBSH44, INR7 and NBH77, Leifsonia INR9, Kribbella SPB151, Sphingomonas NBWT7, Pseudarthrobacter NBSH8 and Cryobacterium INWT7 isolates.

Table 4.4 Antarctic bacterial genome assembly and quality assessments.

Strain                           Assembled size (bp)  Contigs  L50  N50 (bp)   Mean coverage  G+C (%)  Complete (%)  Contam (%)  Largest contig
Streptomyces NBSH44              7,678,790            3        1    7,474,454  94             70       99.7          0.7         Linear
Mesorhizobiums x2 INR15-NBSH29   11,524,875           10       1    6,667,022  73             60.9     100.0         100.0       Circular
Hymenobacter NBH84               5,388,077            6        1    4,779,090  37             56.7     99.4          0.6         Circular
Paracoccus NBH48                 2,970,638            12       2    707,228    22             67       82.6          0.4         Linear
Leifsonia INR9                   4,608,984            3        1    4,438,093  76             70.4     99.5          2.5         Circular
Streptomyces INR7                8,320,846            1        1    8,320,846  46             72.4     99.6          0.8         Linear
Kribbella SPB151                 8,156,807            1        1    8,156,807  58             67.4     99.1          3.2         Circular
Geodermatophilaceae NBWT11       4,594,226            1        1    4,594,226  33             74       97.7          0.8         Circular
Novosphingobium NBM11            5,274,127            4        1    4,151,563  26             65.6     99.1          1.9         Linear
Sphingomonas NBWT7               3,386,925            2        1    3,259,464  45             67.2     99.6          0.7         Circular
Quadrisphaera INWT6              4,201,516            8        2    1,065,298  22             75.3     90.1          0.5         Linear
Streptomyces NBH77               7,021,282            3        1    6,848,830  62             73.3     99.9          0.4         Linear
Azospirillum INR13               4,695,673            50       10   134,676    18             67.4     64.5          4.3         Linear
Pseudarthrobacter NBSH8          4,060,736            1        1    4,060,736  64             64.9     99.7          0.2         Circular
Frigoribacterium NBH87           3,407,805            2        1    3,328,375  37             73.1     98.5          0.0         Circular
Cryobacterium INWT7              3,477,829            2        1    3,328,990  91             66.1     99.5          0.0         Circular

Shading indicates very high-quality assemblies; numbers in bold indicate lower quality markers.

A further four genomes, Hymenobacter NBH84, Geodermatophilaceae NBWT11, Novosphingobium NBM11 and Frigoribacterium NBH87, produced assemblies which were within high-quality ranges determined by CheckM (> 95% complete, < 5% contaminated) (Parks et al. 2015), but exhibited slightly lower coverage and uniformity (26-37x), completeness (97-99%) and/or contiguity (L50 < 2; N50=3.3-4.8 Mb) (Table 4.4).

Three Antarctic bacterial genomes, Paracoccus NBH48, Quadrisphaera INWT6 and Azospirillum INR13, fell outside the high-quality indicator range for completeness (83%, 90% and 65% complete respectively), and exhibited fragmentation (8-50 contigs) and low coverage (18-22x) (Table 4.4). Additionally, two isolates, Mesorhizobium NBSH29 and isolate INR15, formed a dual assembly, indicating that INR15, originally identified as a Frondihabitans species, instead belongs to the genus Mesorhizobium. This was confirmed by CheckM analysis, which showed 100% contamination (Table 4.4).
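The three quality groupings described in Section 4.3.2 can be restated as a simple decision rule. The sketch below is an interpretation rather than the procedure used in the thesis: it combines the CheckM ranges quoted above (> 95% complete, < 5% contaminated) with the "very high-quality" criteria of L50 = 1, coverage > 40x and completeness > 99%. The class `Assembly`, the function `quality_tier` and the tier labels are hypothetical; the example values are taken from Table 4.4.

```python
from dataclasses import dataclass


@dataclass
class Assembly:
    name: str
    l50: int
    coverage: float        # mean coverage (x)
    completeness: float    # CheckM completeness (%)
    contamination: float   # CheckM contamination (%)


def quality_tier(assembly):
    """Assign an assembly to one of three tiers (assumed thresholds)."""
    if assembly.completeness <= 95 or assembly.contamination >= 5:
        return "below high-quality range"
    if assembly.l50 == 1 and assembly.coverage > 40 and assembly.completeness > 99:
        return "very high quality"
    return "high quality"


examples = [
    Assembly("Streptomyces INR7", 1, 46, 99.6, 0.8),
    Assembly("Geodermatophilaceae NBWT11", 1, 33, 97.7, 0.8),
    Assembly("Azospirillum INR13", 10, 18, 64.5, 4.3),
    Assembly("Mesorhizobiums x2 INR15-NBSH29", 1, 73, 100.0, 100.0),
]
tiers = {a.name: quality_tier(a) for a in examples}
for name, tier in tiers.items():
    print(f"{name}: {tier}")
```

Applied to Table 4.4, this rule places the dual Mesorhizobium assembly outside the high-quality range because of its 100% contamination, even though its completeness and contiguity are otherwise high.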

4.3.3 Annotation and functional distribution of genes

Prokka annotation yielded an average of 930 protein-coding sequences per 1 Mb of genome (Table 4.5). Between 70% and 84% of coding sequences (CDS) for each genome were assigned function in COGs analysis (Table 4.5). Of the COG-characterised proteins, categories related to metabolism (C, G, E, F, H, I, P and Q) (Fig. 4.1) comprised the largest overall proportion of COGs, averaging 42%, with amino acid (E) and carbohydrate transport and metabolism (G) groups accounting for the greatest abundance. Group E proteins were particularly abundant in combined Mesorhizobium genomes INR15-NBSH29, Pseudarthrobacter NBSH8 and Paracoccus NBH48, while group G was most abundant in Leifsonia INR9, Frigoribacterium NBH87 and Quadrisphaera INWT6 isolates (Fig. 4.1).

Table 4.5 Predicted protein-coding sequences and CDS assigned to COGs.

Isolate                          Genome (Mb)  CDS    tRNA  tmRNA  rDNA  No. of CDS assigned to COG (% of all CDS)
Streptomyces NBSH44              7.7          7021   82    1      18    4962 (70.6)
Mesorhizobiums x2 INR15-NBSH29   11.5         11455  102   0      9     9653 (84.3)
Hymenobacter NBH84               5.4          4576   49    1      9     3296 (72.0)
Paracoccus NBH48                 3.0          3053   47    0      9     2576 (84.4)
Leifsonia INR9                   4.6          4444   53    1      3     3541 (79.7)
Streptomyces INR7                8.3          7425   93    1      21    5669 (76.4)
Kribbella SPB151                 8.2          7851   73    1      9     5785 (73.7)
Geodermatophilaceae NBWT11       5.0          4441   53    1      9     3595 (81.0)
Novosphingobium NBM11            5.3          4996   60    0      6     3909 (78.2)
Sphingomonas NBWT7               3.4          3231   53    0      6     2673 (82.7)
Quadrisphaera INWT6              4.2          3922   59    1      9     3077 (78.5)
Streptomyces NBH77               7.0          5860   93    1      21    4540 (77.5)
Azospirillum INR13               4.7          4447   65    0      24    3573 (80.3)
Pseudarthrobacter NBSH8          4.1          3745   53    1      12    3085 (82.4)
Frigoribacterium NBH87           3.4          3146   54    1      6     2431 (77.3)
Cryobacterium INWT7              3.5          3350   47    1      3     2676 (79.9)
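The summary figures quoted in Section 4.3.3 can be checked directly against Table 4.5. The sketch below recomputes the coding density (CDS per Mb) for each genome and the proportion of CDS assigned to COGs, using the genome sizes and counts from the table. Taking the unweighted mean of the per-genome densities is an assumption about how the reported average was derived; the names `GENOMES`, `cds_density` and `cog_percentage` are illustrative.

```python
from statistics import mean

# (isolate, genome size in Mb, CDS, CDS assigned to COGs), from Table 4.5.
GENOMES = [
    ("Streptomyces NBSH44", 7.7, 7021, 4962),
    ("Mesorhizobiums x2 INR15-NBSH29", 11.5, 11455, 9653),
    ("Hymenobacter NBH84", 5.4, 4576, 3296),
    ("Paracoccus NBH48", 3.0, 3053, 2576),
    ("Leifsonia INR9", 4.6, 4444, 3541),
    ("Streptomyces INR7", 8.3, 7425, 5669),
    ("Kribbella SPB151", 8.2, 7851, 5785),
    ("Geodermatophilaceae NBWT11", 5.0, 4441, 3595),
    ("Novosphingobium NBM11", 5.3, 4996, 3909),
    ("Sphingomonas NBWT7", 3.4, 3231, 2673),
    ("Quadrisphaera INWT6", 4.2, 3922, 3077),
    ("Streptomyces NBH77", 7.0, 5860, 4540),
    ("Azospirillum INR13", 4.7, 4447, 3573),
    ("Pseudarthrobacter NBSH8", 4.1, 3745, 3085),
    ("Frigoribacterium NBH87", 3.4, 3146, 2431),
    ("Cryobacterium INWT7", 3.5, 3350, 2676),
]


def cds_density(genome_mb, cds):
    """Protein-coding sequences per Mb of genome."""
    return cds / genome_mb


def cog_percentage(cds, cog_cds):
    """Percentage of CDS assigned to a COG."""
    return 100 * cog_cds / cds


densities = [cds_density(mb, cds) for _, mb, cds, _ in GENOMES]
cog_pcts = [cog_percentage(cds, cog) for _, _, cds, cog in GENOMES]
mean_density = mean(densities)

print(f"Mean coding density: {mean_density:.1f} CDS per Mb")
print(f"COG-assigned CDS: {min(cog_pcts):.1f}% to {max(cog_pcts):.1f}%")
```

With these inputs the unweighted mean rounds to the 930 CDS per Mb stated in the text, and the COG-assigned proportions fall within the reported range of roughly 70% to 84%.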

Figure 4.1 Functional classification of protein-coding genes in Antarctic bacterial genomes by abundance of Clusters of Orthologous Groups (COGs). Isolates are arranged by phyla, followed by genome size, left (largest) to right (smallest). The predicted functions of carbohydrate and amino acid metabolism, transcription and signal transduction were among the most abundant COG classes.

COG groups related to information storage and processing (J, A, K, L and B) accounted for approximately 19% of the characterised proteins, with transcription proteins (K) being the most abundant category (Fig. 4.1), particularly for Kribbella SPB151, Leifsonia INR9 and the three Streptomyces isolates, where they accounted for approximately 11-13% relative abundance. Proteins in the cellular processes and signalling COG classes (D, Y, V, T, M, N, Z, W, U and O) accounted for an average of 19% of characterised COGs. Here, signal transduction mechanisms (T) and cell wall/membrane/envelope biogenesis (M) were the most abundant, with group T particularly abundant in the Quadrisphaera INWT6, Azospirillum INR13 and Streptomyces INR7 isolates, and group M most abundant in the Bacteroidetes isolate Hymenobacter NBH84 (Fig. 4.1). Approximately 20% of proteins were assigned to the poorly characterised COG groups (R and S), representing predicted proteins and those of unknown function (Fig. 4.1).

4.3.4 Phylogenetic analysis based on 16S rDNA genes

4.3.4.1 Actinobacteria: Streptomyces group

In 16S rDNA gene sequence analysis, Streptomyces isolate INR7 showed closest homology to S. virginiae (99% sequence identity), a species known to produce the streptogramin antibiotic virginiamycin (Fig. 1.1) (Kingston & Kolpak 1980). In maximum likelihood phylogenetic tree analysis (Fig. 4.2), INR7 formed a closely related clade that included S. virginiae, S. lavendulae and S. cirratus. Analysis using the 16S rDNA gene sequence for Streptomyces spp. is known to produce trees with low support for delineated clades, as seen here (Fig. 4.2) (Nouioui et al. 2018).

Figure 4.2 Maximum likelihood phylogenetic tree of 16S rDNA gene for Streptomyces Antarctic isolates. The Antarctic Streptomyces spp. belonged to three distinct clades comprised of closely related species. Numbers at the branches correspond to confidence values based on 100 bootstrap replications, with only those > 50% shown.

Streptomyces NBSH44 was most similar to S. finlayi (99%), a species originally isolated from an Egyptian soil rhizosphere (Szabo 1978). In phylogenetic analysis, NBSH44 additionally formed a clade with S. clavifer (Fig. 4.2). Streptomyces NBH77 shared an identical 16S rDNA sequence with S. rutgersensis (100%), a species previously reported to produce a bacteriolytic enzyme, SR1, active against Gram-positive bacteria (Shimonishi et al. 1999). NBH77 was placed in a closely related clade alongside S. gougerotii and S. intermedius.

4.3.4.2 Actinobacteria: non-Streptomyces group

For the non-Streptomyces Actinobacterial isolates Kribbella SPB151, Pseudarthrobacter NBSH8, Frigoribacterium NBH87 and Leifsonia INR9, 99% sequence similarity was observed to the related species K. qitaiheensis, P. phenanthrenivorans, F. endophyticum and L. shinshuensis, respectively (Fig. 4.3). Phylogenetic tree analysis indicated that isolates Geodermatophilaceae NBWT11, Cryobacterium INWT7 and Quadrisphaera INWT6 were potentially novel species, with nearest identities to Klenkia marina (97%), F. endophyticum (97%) and Q. granulorum (98%), respectively (Fig. 4.3).

4.3.4.3 Alphaproteobacteria group

Paracoccus NBH48, Mesorhizobium spp. INR15 and NBSH29, and Sphingomonas NBWT7 all shared 99% sequence similarity with related species: P. carotinifaciens, M. australicum, M. chacoense and Sphingomonas jeddahensis, respectively (Fig. 4.4). Isolates Azospirillum INR13 and Novosphingobium NBM11 showed lower homology (98%) to the related species A. zeae and N. stygium, respectively (Fig. 4.4).

Figure 4.3 Maximum likelihood phylogenetic tree of 16S rDNA gene for Antarctic isolates belonging to the Actinobacteria phylum (excepting Streptomyces spp.). Branch numbers correspond to confidence values based on 100 bootstrap replications, with only those > 50% shown. The Cryobacterium isolate INWT7 and the Geodermatophilaceae isolate NBWT11 are likely to be novel, both sharing 97% similarity to known species. The Quadrisphaera isolate INWT6 shares 98% similarity to known species.

Figure 4.4 Maximum likelihood phylogenetic tree of 16S rDNA gene for Antarctic isolates belonging to the Proteobacteria phylum. Numbers at the branches correspond to confidence values based on 100 bootstrap replications, with only those > 50% shown. The Azospirillum isolate INR13 and the Novosphingobium isolate NBM11 are both 98% similar to known species.

4.3.4.4 Bacteroidetes: Hymenobacter

Prior to genome sequencing, the Hymenobacter isolate NBH84 showed closest similarity (97%) to the known species H. xinjiangensis. Following genome sequencing and NCBI database updates, the isolate now shares 99% identity with a newly described species, H. defluvii (Fig. 4.5).

Figure 4.5 Maximum likelihood phylogenetic tree of 16S rDNA gene for Bacteroidetes Antarctic isolate Hymenobacter sp. NBH84. Prior to genome sequencing the isolate shared 97% homology to known species. Following updates to the NCBI database, the strain now shows 99% similarity to newly identified species H. defluvii. Numbers at the branches correspond to confidence values based on 100 bootstrap replications, with only those > 50% shown.

4.3.5 Biosynthetic gene clusters detected in Antarctic genomes

Across the genomes of all 17 Antarctic bacteria, a total of 147 BGCs were detected using antiSMASH. The greatest numbers of BGCs were found in the Streptomyces isolates (Table 4.6, Fig. 4.6). Strain INR7 carried the most, totalling 31 clusters spanning 1.4 Mb and representing 17% of the isolate's genome (Table 4.6, Fig. 4.6). Strain NBSH44 carried 26 clusters, which spanned 0.8 Mb in total and comprised 10% of its genome, while isolate NBH77 dedicated 14% of its genome to 22 BGCs spanning 1.0 Mb. Kribbella SPB151 had the greatest number of BGCs among the non-Streptomyces isolates, with 10 detected clusters covering nearly 0.5 Mb and representing 5.9% of its genome (Fig. 4.6, Table 4.6). Overall, BGC similarity (the percentage of genes which showed similarity to known clusters) was low, with 111 BGCs (75%) displaying < 70% similarity (Table 4.6). Fifty BGCs (34%) shared no similarity with any known cluster (Appendix Table A3.2).

Table 4.6 Proportion of Antarctic bacterial genomes dedicated to secondary metabolite biosynthetic clusters.

Genome                             Total BGCs (Mb)   % of genome   Proportion of BGCs < 70% similar
Streptomyces INR7                             1.43          17.2   21/31
Streptomyces NBH77                            1.01          14.4   14/22
Streptomyces NBSH44                           0.79          10.3   19/26
Kribbella SPB151                              0.48           5.9   9/10
Cryobacterium INWT7                           0.15           4.4   4/5
Pseudarthrobacter NBSH8                       0.16           4.0   4/5
Leifsonia INR9                                0.18           3.9   4/5
Frigoribacterium NBH87                        0.13           3.8   5/5
Paracoccus NBH48                              0.10           3.2   2/4
Novosphingobium NBM11                         0.16           3.1   5/6
Mesorhizobium x2 INR15_SH29                   0.33           2.8   13/13
Geodermatophilaceae NBWT11                    0.10           2.3   4/4
Quadrisphaera INWT6                           0.08           2.0   1/3
Sphingomonas NBWT7                            0.07           1.9   1/2
Azospirillum INR13                            0.09           1.9   2/2
Hymenobacter NBH84                            0.09           1.7   3/4
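The totals reported in the text (147 BGCs, of which 111 showed < 70% similarity to known clusters) follow directly from the last column of Table 4.6. As a brief sketch, with illustrative names that are assumptions rather than part of the original workflow, the tally can be reproduced as follows.

```python
# Last column of Table 4.6, transcribed as
# (isolate, BGCs with < 70% similarity to known clusters, total BGCs)
TABLE_4_6_COUNTS = [
    ("Streptomyces INR7", 21, 31),
    ("Streptomyces NBH77", 14, 22),
    ("Streptomyces NBSH44", 19, 26),
    ("Kribbella SPB151", 9, 10),
    ("Cryobacterium INWT7", 4, 5),
    ("Pseudarthrobacter NBSH8", 4, 5),
    ("Leifsonia INR9", 4, 5),
    ("Frigoribacterium NBH87", 5, 5),
    ("Paracoccus NBH48", 2, 4),
    ("Novosphingobium NBM11", 5, 6),
    ("Mesorhizobium x2 INR15_SH29", 13, 13),
    ("Geodermatophilaceae NBWT11", 4, 4),
    ("Quadrisphaera INWT6", 1, 3),
    ("Sphingomonas NBWT7", 1, 2),
    ("Azospirillum INR13", 2, 2),
    ("Hymenobacter NBH84", 3, 4),
]

# Sum both columns across all genomes.
total_bgcs = sum(total for _, _, total in TABLE_4_6_COUNTS)
low_similarity_bgcs = sum(low for _, low, _ in TABLE_4_6_COUNTS)

# Fraction of all detected BGCs that fall below the 70% similarity mark.
low_similarity_fraction = low_similarity_bgcs / total_bgcs

print(total_bgcs, low_similarity_bgcs, f"{100 * low_similarity_fraction:.1f}%")
```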

Figure 4.6 Biosynthetic gene clusters detected by antiSMASH in Antarctic bacterial genomes. The most abundant BGCs were terpene and NRPS-containing clusters. The three Streptomyces isolates harboured the greatest number of BGCs, followed by the Kribbella. Isolates are arranged left to right by highest number of BGCs, and clusters are arranged top to bottom, by greatest number of BGC type.

Terpenes were the most abundant BGC class identified, with 30 clusters predicted. This was followed by NRPSs (15), Type III PKSs (14), bacteriocins (9) and siderophores (9) (Fig. 4.6).

A quarter of all BGCs comprised several classes of biosynthetic machinery, including chemical hybrids (Fig. 4.6, Appendix Table A3.2). Overall, 44% of clusters contained NRPS and/or PKS genes. Combined, the number of clusters containing NRPS genes was comparable with the number of terpene clusters (30). Type I PKS-containing clusters totalled 19, the majority of which were hybrid NRPS/Type I PKS clusters (12). Only two Type II PKS clusters were detected.

4.3.6 Biosynthetic gene cluster verification for Streptomyces, Kribbella and Azospirillum isolates

4.3.6.1 Streptomyces INR7 BGCs

Manual inspection of the 31 BGCs detected in Streptomyces isolate INR7 revealed 10 clusters with high homology (> 70%) to known BGCs, indicating that they encode the same or similar products (Fig. 4.7, Appendix Table A3.2). These included two different ribosomally synthesised and post-translationally modified peptides (RiPPs): SapB, which acts as a surfactant during formation of aerial hyphae (100% sim, Region 29); and venezuelin, a class IV lanthipeptide of unknown activity (100% sim, Region 31) (Straight et al. 2006, van der Donk & Nair 2014). Others included the siderophores coelichelin (100% sim, Region 7) and desferrioxamine B (83% sim, Region 22) (Figs. 1.2 and 4.7); three terpenes, 2-methylisoborneol, geosmin and avermitilol (100% sim, Regions 6, 14 & 26); a Type III PKS encoding alkylresorcinol, a phenolic lipid antimicrobial compound also involved in cyst formation (100% sim, Region 2) (Funa et al. 2006); and two NRPSs, one encoding a tetrapeptide antitumour compound, tambromycin (100% sim, Region 25) (Fig. 4.7, Appendix Table A3.2), and one similar to that encoding the broad-spectrum antibiotic streptothricin (87% sim, Region 28) (Yu et al. 2018). For the tambromycin BGC (Region 25), gene placement was identical to that of the known cluster (Fig. 4.8), with individual genes showing 91-99% identity. This indicates that the same compound is likely to be produced by INR7.

For the remaining 21 clusters detected in Streptomyces INR7, inspection revealed lower homology to known BGCs (< 70% of genes showed similarity), suggesting that they encode different end products. For three of these regions, a large proportion of genes were similar to known BGCs, but gene order differed and several genes were absent. They most closely matched BGCs encoding hopene, a sterol-like membrane lipid which affects membrane fluidity (Seipke & Loria 2009, Nett et al. 2009) (61% sim, Region 11); a curamycin-like Type II polyketide spore pigment (63% sim, Region 27); and an isorenieratene-like carotenoid terpene (66% sim, Region 30). Fourteen BGCs displayed < 50% similarity to known BGCs, with closest matches encoding kedarcidin, A54145, istamycin, monensin, chloramphenicol, echosides, herboxidiene, svaricin, elloramycin, friulimicin, jerangolid, RK-682, kinamycin and himastatin. Four BGCs were not similar to any known cluster (Appendix Table A3.2). Interestingly, although S. virginiae is typically known as a producer of the antibiotic virginiamycin (Pulsawat et al. 2009), this cluster was not detected in the INR7 strain.
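The grouping used in this section, by the percentage of genes in a BGC that show similarity to a known cluster, can be expressed as a simple binning rule. The sketch below is illustrative only: the tier names, the function name and the handling of values of exactly 50% or 70% are assumptions, since the text specifies only "> 70%", "< 70%", "< 50%" and clusters with no match. The example values are those quoted in this chapter for Streptomyces INR7.

```python
def similarity_tier(percent_genes_similar):
    """Bin a BGC by the % of its genes similar to the closest known cluster.

    Tier labels and boundary handling are assumptions for illustration:
      none      no similarity to any known cluster (0%)
      high      > 70%, same or similar product likely
      moderate  50-70%, related product with probable structural divergence
      low       < 50%
    """
    if percent_genes_similar == 0:
        return "none"
    if percent_genes_similar > 70:
        return "high"
    if percent_genes_similar >= 50:
        return "moderate"
    return "low"


# Selected INR7 regions with the similarity values reported in the text.
INR7_EXAMPLES = {
    "Region 25 (tambromycin)": 100,
    "Region 28 (streptothricin)": 87,
    "Region 22 (desferrioxamine B)": 83,
    "Region 30 (isorenieratene-like)": 66,
    "Region 27 (curamycin-like)": 63,
    "Region 11 (hopene)": 61,
    "Region 24 (Type I PKS)": 3,
    "Region 9 (Type I PKS)": 0,
}

tiers = {region: similarity_tier(sim) for region, sim in INR7_EXAMPLES.items()}
for region, tier in tiers.items():
    print(f"{region}: {tier}")
```

A tier is only a first-pass triage. As the tambromycin and C-1027 examples in this chapter show, gene order and individual gene identity still need manual inspection before a product is inferred.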

Figure 4.7 Circular representation of the Streptomyces INR7 genome. The position and type of BGCs detected by antiSMASH are depicted as coloured bars. For each BGC, the percentage of genes which showed similarity to known BGCs is displayed in the inner ring. Clusters likely to produce the same or similar compound as the closest BGC match have the compound name in the outer ring. The INR7 genome was a single, linear contig 8.3 Mb in length, with approximately 17% of the genome dedicated to BGCs.

Figure 4.8 The Streptomyces INR7 genome contains an NRPS BGC, Region 25, with 100% gene similarity to the tambromycin BGC. Tambromycin is an antitumour compound containing an unusual amino acid, tambroline (shaded). The region is contiguous and individual genes show high protein sequence similarity (91-99%), indicating the same compound is likely produced. Tambromycin structure adapted from Goering et al. (2016).

4.3.6.2 Streptomyces NBH77 BGCs

Of the 22 clusters detected in Streptomyces NBH77, six showed potential for biosynthesis of the same or similar compounds, exhibiting 100% gene cluster similarity (Fig. 4.9). These included ectoine, a compound which helps microorganisms cope with osmotic stress (Vicente et al. 2018) (Region 6); the siderophore desferrioxamine B (Region 7); the odoriferous compound geosmin (Region 14); the antibiotic terpene albaflavenone (Region 13); the polycyclic tetramate macrolactams, SGR PTMs (Region 19); and a large NRPS-Type I PKS cluster encoding two compounds, the antibiotic antimycin and the antifungal compound candicidin (Region 3) (Fig. 4.9, Appendix Table A3.2). The Region 3 BGC spanned a total of 209 kb, with the NRPS-Type I PKS hybrid region matching the antimycin BGC, which includes a gramicidin synthetase, with 82-97% individual gene sequence identity (Fig. 4.10). The neighbouring Type I PKS cluster matched both the candicidin and FR-008 clusters, but with an additional transposase gene (Fig. 4.10). Individual genes within the Type I PKS showed high homology to both candicidin and FR-008 BGC genes, with 84-95% identity.

Two Streptomyces NBH77 clusters showed high similarity to known terpene clusters encoding isorenieratene (85% sim, Region 5) and hopene (76% sim, Region 18). Although gene placement differed, they may produce similar compounds. Eight BGCs showed low homology (7-40% of genes similar) to known clusters, with closest matches encoding mannopeptimycin, daptomycin, chloramphenicol, herboxidiene, leinamycin, pellasoren and scabichelin. Six clusters displayed no similarity to known BGCs (Appendix Table A3.2).

Figure 4.9 Circular representation of the Streptomyces NBH77 genome. The position and type of BGCs detected by antiSMASH are depicted as coloured bars. For each BGC, the percentage of genes which showed similarity to known BGCs is displayed in the inner ring. Clusters likely to produce the same or similar compound as the closest BGC match have the compound name in the outer ring. The NBH77 genome comprised one large linear contig, 6.8 Mb in length, and two smaller contigs, 147 kb and 25 kb in length, which are putative plasmids. No BGCs were detected in the second or third contigs. BGCs spanned over 14% of the NBH77 genome.

Figure 4.10 The Streptomyces isolate NBH77 NRPS-Type I PKS BGC, Region 3. The hybrid NRPS-Type I PKS region closely resembles the cluster encoding the antibiotic antimycin, with 100% of genes showing similarity. The neighbouring Type I PKS region closely resembles clusters encoding the antifungal polyene macrolides candicidin and FR-008, with individual gene identity of 84-95%, except that an extra transposase gene was present between FscT11 and FscC. The antimycin and candicidin BGCs are known to cluster together in other species that harbour them.

4.3.6.3 Streptomyces NBSH44 BGCs

Of the 26 BGCs detected in Streptomyces NBSH44, five were highly similar (100% of genes showed similarity) to clusters encoding isorenieratene (Region 1), melanin (Region 5), ectoine (Region 19), the AmfS RiPP (Region 22) and geosmin (Region 23) (Fig. 4.11, Appendix Table A3.2), suggesting the production of the same or similar products. Two BGCs were highly similar to known clusters but displayed alternate gene placement: desferrioxamine B (80% sim, Region 15) and hopene (84% sim, Region 2). The remaining 19 NBSH44 BGCs showed low homology (< 51% sim) to known clusters, which included the β-lactam carbapenem MM 4550, lactonamycin, lysolipin, mannopeptimycin, enduracidin, clorobiocin, goadsporin, bacillibactin, steffimycin, maduropeptin, clavulanic acid and thiolutin gene clusters.

One region showed 42% gene similarity to the BGC for the antitumour compound C-1027 (Contig 2, Region 2) (Fig. 4.11, Appendix Table A3.2). On further examination, the cluster showed high similarity (100% sim) to a C-1027 sub-cluster, indicating likely production of a C-1027-like enediyne (Fig. 4.12). Individual genes showed high similarity (85-96%). However, additional nuclease and transcriptional regulator domains were present, suggesting that the NBSH44 sequence may be more complete. Six NBSH44 clusters shared no similarity with any known BGCs (Fig. 4.11, Appendix Table A3.2).

Figure 4.11 Circular representation of the Streptomyces NBSH44 genome. The position and type of BGCs detected using antiSMASH are depicted as coloured bars. For each BGC, the percentage of genes which were similar to known BGCs is displayed in the inner ring. Clusters likely to produce the same or similar compound as the closest BGC match have the compound name in the outer ring. The NBSH44 genome comprised one large linear contig, 7.4 Mb in length, and two smaller contigs, 180 kb and 23 kb in length, which are most likely plasmids. The second contig contained three BGCs. In total, BGCs spanned approximately 10% of the NBSH44 genome.

Figure 4.12 The putative plasmid, contig 2, carried by Streptomyces isolate NBSH44, contains a Type I PKS (Region 2) with 42% of genes similar to the antitumour compound C-1027 BGC. The sub-cluster, encoding an enediyne, shared 100% gene similarity, suggesting capacity for production of a C-1027-like enediyne chromophore. Several additional genes were present, which may indicate the NBSH44 sequence is more complete than the matching sub-cluster. C-1027 enediyne structure adapted from Liu et al. (2002).

4.3.6.4 Kribbella SPB151 BGCs

Of the ten BGCs uncovered in Kribbella isolate SPB151, only one showed high gene similarity (100%) to a known cluster: alkylresorcinol (Region 8) (Fig. 4.13). Four BGCs exhibited low gene similarity (< 35%) to the asukamycin, avilamycin, albachelin and thiocoraline gene clusters, and five BGCs had no similar homologues in the database (Fig. 4.13, Appendix Table A3.2).

4.3.6.5 Azospirillum INR13 BGCs

The fragmented, low-coverage genome of Azospirillum isolate INR13 contained two BGCs which exhibited low identity to known clusters. The closest matches were to BGCs encoding fengycin and anthracimycin, respectively (~20% of genes showed similarity). Interestingly, the anthracimycin-like cluster (Contig 10, Region 1) showed high similarity (100% sim) to the polyunsaturated fatty acid (PUFA) biosynthetic gene cluster from the genome of the terrestrial myxobacterial genus Aetherobacter (Fig. 4.14). The genes appear fragmented in comparison to the genome of a similar Azospirillum species and showed only 54-57% individual gene identity with the Aetherobacter sp. genes Pfa1, Pfa2 and Pfa3. Nevertheless, the Azospirillum isolate may have the capacity to produce biotechnologically important PUFA compounds (Fig. 4.14).

4.3.6.6 PKS and NRPS gene amino acid sequence similarity to known genomic regions

When compared with amino acid sequences in the GenBank database, PKS- and NRPS-containing genes from Streptomyces isolates INR7 and NBH77 revealed close matches (84-100%, av. 99%) to previously sequenced genome regions (Appendix Tables A3.3-A3.6). On average, lower similarity matches were observed for both the Streptomyces NBSH44 (40-91%, av. 83%) and Kribbella SPB151 isolates (49-92%, av. 76%) (Appendix Tables A3.3-A3.6). The PKS genes in Azospirillum INR13 shared 93-95% identity with known genome regions (Appendix Tables A3.3-A3.6).

Figure 4.13 Circular representation of the Kribbella SPB151 genome. The position and type of BGCs detected by antiSMASH are depicted as coloured bars. For each BGC, the percentage of genes which showed similarity to known BGCs is displayed in the inner ring. The Type III PKS cluster encoding synthesis of alkylresorcinol was the only gene cluster likely to produce the same or similar compound as the closest BGC match. The SPB151 genome comprised one large circular contig, 8.1 Mb in length, with BGCs spanning approximately 6% of the genome.

Figure 4.14 The Azospirillum isolate INR13 harbours a potential polyunsaturated fatty acid (PUFA) synthase cluster. 100% of genes showed similarity to an Aetherobacter sp. cluster known to produce PUFA. The genes are fragmented in the INR13 strain, in comparison to the same region detected in similar species Azospirillum lipoferum.

4.3.7 NaPDoS analysis of ketosynthase and condensation domains

Across all BGCs, a total of 64 PKS ketosynthase (KS) domains and 100 NRPS condensation (C) domains were identified. These were found exclusively in Streptomyces strains INR7, NBSH44 and NBH77, Kribbella SPB151 and Azospirillum INR13. Overall, these domains exhibited low protein sequence identity to NaPDoS database pathway domain sequences, averaging 63% for KS and 44% for C domains. This indicated that the encoded products are likely to differ from those of the pathways curated in the database (Appendix Tables A3.3-A3.6). For the NaPDoS database, a domain similarity of > 85% is suggested to indicate that the same or a similar compound is encoded (Ziemert et al. 2012). Here, only a single domain passed this threshold: the Streptomyces NBSH44 contig 2, Region 2 KS domain, which aligned with the C-1027 enediyne pathway with 92% similarity, a result which complements the antiSMASH BGC match.

516 Interestingly, for the Streptomyces NBH77 candicidin-like cluster (Region 3), KS domains

517 aligned most closely with another polyene macrolide antifungal pathway; nystatin (68-82%

518 identity), rather than candicidin (Fig 4.12). Overall, 33% of Antarctic KS domains aligned

519 most closely with NaPDoS-curated pathways encoding polyene macrolide antifungal

520 compounds. A further 36% of KS domains showed similarity to antitumour compounds, such

521 as epothilone, leinamycin and alnumycin, with sequence similarities ranging from 42-72%.

Four KS domains aligned most closely with PUFA-type domains, and these relationships were confirmed by phylogenetic tree analysis (Fig. 4.15). Two of these were from the Azospirillum INR13 genome, confirming the antiSMASH result. Interestingly, the remaining two PUFA-like KS domains were detected in Streptomyces INR7, within two separate Type I PKS BGCs (Regions 9 and 24) (Appendix Table A3.2). These INR7 BGCs exhibited 0% and 3% similarity to known BGCs, respectively, but both showed high identity to previously sequenced genome regions, suggesting as-yet-uncharacterised biosynthetic pathways.

A large proportion (28%) of the C domains in the bacterial isolate genomes showed closest similarity to lipopeptide pathways encoding biosurfactants, such as syringomycin and iturin, albeit with low sequence similarity (av. 39%), indicating that different lipopeptides are encoded. These relationships were supported by phylogenetic analysis, where domains from multiple isolates formed clades but did not align with any known database pathways (Fig. 4.16). Other domains formed closer relationships with pathway domains encoding the antitumour compounds actinomycin, thiocoraline and bleomycin, and with antimicrobial pathways such as calcium-dependent antibiotic (CDA) and pristinamycin (Fig. 4.16).

Figure 4.15 Phylogenetic analysis of ketosynthase domains by maximum likelihood method against NaPDoS database domains. The majority of KS domains showed closest similarity to pathways encoding antitumour and antifungal compounds. PUFA-like domains were found in both the Azospirillum INR13 and Streptomyces INR7 strains.

Figure 4.16 Phylogenetic analysis of condensation domains by maximum likelihood method against NaPDoS database domains. Predominantly, C domains aligned most closely to pathways encoding antitumour, antimicrobial and surfactant compounds.

4.4 DISCUSSION

Here, we report high-quality genome assemblies for twelve Antarctic bacteria, produced through long-read PacBio sequencing of multi-genome libraries. High-quality assemblies were classed here as being 97.7-99.9% complete, with low contamination (0-4.3%) and high contiguity (N50 > 3.2 Mb) (Table 4.4) (Parks et al. 2015, Sedlazeck et al. 2018, Koren et al. 2013). The genomes of Streptomyces, Kribbella, Sphingomonas, Novosphingobium, Leifsonia, Geodermatophilaceae, Pseudarthrobacter, Hymenobacter, Cryobacterium and Frigoribacterium are some of the first complete bacterial genomes described from the arid desert soils of eastern Antarctica. Multi-genome sequencing methods were unable to resolve complete assemblies for five additional isolates, four of which displayed fragmentation and low coverage, most likely due to either complications during DNA extraction or preferential sequencing of dominant strains due to insufficient DNA input (Table 4.4). Additionally, two strains could not be distinguished because original taxon misidentification resulted in closely related species being combined in the same multi-genome library.

The greatest NP capacity was demonstrated by Actinomycetales with genomes > 7 Mb in size, specifically the Streptomyces and Kribbella isolates, which carried 10-31 BGCs each. This is consistent with previous reports showing a correlation between BGC carriage and genome size, with Actinomycetales known to be particularly prolific in terms of NP biosynthesis (Baltz 2017, Wang et al. 2014, Donadio et al. 2007). Here, Streptomyces spp. had a mean genome size of 7.7 Mb, harbouring an average of 26 BGCs spanning 1.1 Mb, or ~14% of the genome. These results are somewhat lower than those reported by Baltz (2017), who found an average Streptomyces genome size of 9.3 Mb, with ~35 BGCs covering 1.5 Mb and representing 16% of total genome size. In contrast, the average prokaryote dedicates ~4% of its genome to BGCs (Cimermancic et al. 2014).
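The Streptomyces averages quoted above can be reproduced from the per-isolate values in Tables 4.5 and 4.6. The sketch below is illustrative; the names are assumptions, the genome sizes are the rounded values from Table 4.5, and the Baltz (2017) figures are those cited in the text.

```python
from statistics import mean

# (genome size in Mb, number of BGCs, total BGC span in Mb) per Antarctic isolate,
# taken from Tables 4.5 and 4.6.
ANTARCTIC_STREPTOMYCES = {
    "INR7": (8.3, 31, 1.43),
    "NBSH44": (7.7, 26, 0.79),
    "NBH77": (7.0, 22, 1.01),
}

mean_genome_mb = mean(size for size, _, _ in ANTARCTIC_STREPTOMYCES.values())
mean_bgc_count = mean(count for _, count, _ in ANTARCTIC_STREPTOMYCES.values())
mean_bgc_mb = mean(span for _, _, span in ANTARCTIC_STREPTOMYCES.values())

# Share of the average genome occupied by BGCs, in percent.
antarctic_bgc_percent = 100 * mean_bgc_mb / mean_genome_mb

# Literature averages as cited in the text from Baltz (2017).
baltz_genome_mb, baltz_bgc_mb = 9.3, 1.5
baltz_bgc_percent = 100 * baltz_bgc_mb / baltz_genome_mb

print(f"Antarctic: {mean_genome_mb:.1f} Mb, {mean_bgc_count:.0f} BGCs, "
      f"{mean_bgc_mb:.1f} Mb ({antarctic_bgc_percent:.0f}%)")
print(f"Baltz (2017): {baltz_genome_mb} Mb, {baltz_bgc_mb} Mb ({baltz_bgc_percent:.0f}%)")
```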

Clusters with similarity to those encoding desferrioxamine, geosmin and hopene were found in all three Antarctic Streptomyces genomes; these, along with clusters encoding melanin, ectoine and isorenieratene, are known to be highly conserved in Streptomyces (Kim et al. 2015, Vicente et al. 2018, Komaki et al. 2018) (Appendix Table A3.2). An additional set of BGCs was common across a variety of the Antarctic bacterial genomes, namely alkylresorcinol (5 spp.) and a diversity of carotenoid clusters (12 spp.). Together, these groups accounted for 26 of the 36 BGCs which showed high similarity (> 70%) to known clusters, and emphasise the ecological roles of secondary metabolites in the natural environment: iron chelation, defence, signalling and protection against environmental stressors (Kim et al. 2015, Vicente et al. 2018, Komaki et al. 2018, Adamek et al. 2018).

Three quarters of all BGCs detected in the Antarctic bacterial strains showed < 70% gene cluster similarity to known BGCs (Appendix Table A3.2). Compared with known clusters, they exhibited alternate gene order, absent or additional genes, matches only to accessory genes, and/or low individual gene identity (< 60%). As genetic variance correlates with compound structural variance, clusters with moderate similarity to known clusters may encode molecules within the same class, but which diverge structurally from the known pathway product (Crits-Christoph et al. 2018, Blin et al. 2017a). Furthermore, a third of all BGCs across the Antarctic genomes shared no similarity with any known cluster, implying new chemical entities, or compounds for which the BGC remains uncharacterised (Blin et al. 2017a, Baldim et al. 2017, Challis 2008).

At least three known NP compounds, tambromycin, antimycin and candicidin, are likely biosynthesised by PKS and NRPS pathways in the Antarctic Streptomyces isolates INR7 and NBH77, as inferred from highly analogous gene clusters (Figs. 4.8 & 4.10). Tambromycin is an unusual antitumour agent incorporating a novel pyrrolidine-containing amino acid, tambroline. Although the compound has only recently been uncovered, the tambromycin cluster appears widely distributed amongst the S. virginiae clade, to which INR7 belongs (Goering et al. 2016, Zhang et al. 2018) (Figs. 4.8 and 4.2). The combined antimycin and candicidin cluster (NBH77, Fig. 4.10) is also widely disseminated amongst various Streptomyces spp. (Caffrey et al. 2016, Jorgensen et al. 2009). Antimycin is a depsipeptide which exhibits diverse bioactivities including antifungal, insecticidal, nematocidal and antitumour properties, while the polyene macrolide candicidin is an antifungal (Joynt & Seipke 2018, Caffrey et al. 2016) (Fig. 4.10). Across the Antarctic genomes, analysis of all PKS ketosynthase and NRPS condensation domains revealed closest similarity to pathway domain sequences encoding antitumour, antifungal, antimicrobial and biosurfactant compounds. Here, protein sequence identities were < 85%, indicating product structural divergence (Figs. 4.15 & 4.16, Appendix Tables A3.3-A3.6).

The cold-adapted Streptomyces sp. (NBSH44), which was recovered by novel SSMS methods (Section 3.2.3, Fig. 3.14D), and the Kribbella isolate, representing a rarely cultured Actinobacterial genus, both exhibited lower BGC gene similarity to known genome regions (av. 76-83%) when compared with the Streptomyces strains INR7 and NBH77 (av. 99%) (Appendix Tables A3.3-A3.6). This suggests that rarer Actinobacteria with large genomes are particularly promising targets for novel NP discovery.

Antarctic bacteria are established sources of unusual fatty acids such as LC-PUFA, whose primary functions involve maintenance of membrane fluidity and nutrient transport at cold temperatures (Gemperlein et al. 2014, Bianchi et al. 2014, Nichols et al. 1993). Additionally, psychrophilic Serratia bacteria have been found to incorporate PUFA synthase-like machinery in the formation of unusual zeamine antibiotics (Masschelein et al. 2015). Here, two isolates were found to contain BGCs with PUFA-domain sequence homology: Streptomyces INR7 and Azospirillum INR13 (Fig. 4.15; Appendix Table A3.4). The Azospirillum PUFA-like region is particularly exciting, exhibiting high similarity to the PUFA synthase of Aetherobacter species. In Aetherobacter, the PUFA cluster encodes production of the omega-6 PUFA arachidonic acid (AA) and two omega-3 PUFAs, eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) (Fig. 4.14). PUFA-producing bacteria are almost exclusively slow-growing psychrophiles, which limits their industrial value. In contrast, Azospirillum INR13 grows abundantly at RT. If it does indeed produce PUFA, the strain may be highly desirable for commercial purposes as a sustainable source of nutritional supplements for the human health and aquaculture industries (Gemperlein et al. 2014). Thus, this isolate represents a worthy target for future fatty acid profiling work.

In conclusion, the generation of high-quality bacterial genomes from eastern Antarctica highlighted the presence of unusual BGCs that represent exciting targets for the discovery of medically relevant NPs, such as potential antifungal, antitumour and antibacterial compounds, as well as industrially important clusters putatively encoding lipids, pigments, biosurfactants and siderophores, which provide insights into cold-adapted microbial ecology. The genomes revealed a high proportion of uncharacterised BGCs, confirming that Antarctic bacteria, especially those from rarely isolated and psychrotrophic Actinomycetales groups, show great promise as sources of novel NPs.


CHAPTER FIVE

5 DISCUSSION AND CONCLUSIONS

5.1 RESEARCH MOTIVATIONS AND OBJECTIVES

Polar regions represent some of the most extreme habitats on Earth. Through adaptation, microorganisms survive and thrive in these inhospitable environments, enduring freezing temperatures, extreme nutrient and water limitation, high UV radiation, long periods of darkness and frequent freeze-thaw cycles (Yergeau et al. 2012, Obbels et al. 2016, Aislabie et al. 2004). Polar soils are exciting targets for NP biomining owing to an abundance of Actinobacteria, Proteobacteria and uncharacterised taxa (Ji et al. 2017), coupled with an elevated novel metabolite potential, predicted as a response to the unique environment (de Pascale et al. 2012). The biosynthetic capacity of polar microbiomes, however, has remained largely unknown. This thesis has examined the NP capacity of cold-adapted bacteria residing in desert soils of eastern Antarctica and the high Arctic, revealing a wealth of novel NP gene diversity.

The cultivation of microbes remains critical to NP discovery. Microbial dark matter, comprising taxa resistant to traditional culturing approaches, still vastly outnumbers characterised taxa (Lloyd et al. 2018, Rappé & Giovannoni 2003). A major impediment to progress in the NP field has been the continual re-discovery of identical metabolites from similar microbial species (Bérdy 2005, Masschelein et al. 2017, Baltz 2007, Harvey et al. 2015). Therefore, the discovery of new metabolites demands the capture of novel or rarely isolated groups. Here, two novel oligotrophic culturing methods resulted in the capture of 47 Antarctic bacterial species, predominantly from two of the phyla most prolific in bioactive compound production, the Actinobacteria and Proteobacteria. These included rarely cultured Actinomycetales genera such as Frigoribacterium and Janibacter, as well as 18 members of the biosynthetically rich Streptomyces genus.

The field of NP discovery has been revived by contemporary advancements in sequencing and bioinformatics (Milshteyn et al. 2014). In this study, the third-generation long-read sequencing platform PacBio RS II was employed to retrieve long DNA reads for biosynthetic gene analysis, both directly from soil and from the whole genomes of newly isolated Antarctic bacteria. First, amplicon sequencing was used to survey > 200 polar soils for PKS ketosynthase/acyltransferase and NRPS adenylation domains. Sequences predominantly showed low protein sequence similarity (< 70%) to known genes, aligning most closely to domain sequences encoding antifungal, antitumour and antimicrobial/surfactant compounds. Further, arid Antarctic soils showed the greatest biosynthetic potential. Second, long-read sequencing was used to obtain whole genomes from 17 of the isolated Antarctic bacterial species. High-quality assemblies were obtained, revealing 147 BGCs in total, of which the majority displayed < 70% similarity to known BGCs. In accordance with the amplicon sequencing results, many PKS and NRPS domains aligned most closely to antifungal, antitumour and antimicrobial/surfactant-encoding genes.
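The < 70% similarity cut-off referred to above can be illustrated with a minimal sketch. It assumes two protein sequences that have already been aligned to equal length (gaps written as "-") and computes percent identity over the columns where neither sequence has a gap. The function names, the toy sequences and this particular definition of identity are illustrative assumptions, not the pipeline used in this thesis.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gap columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def is_low_similarity(identity: float, threshold: float = 70.0) -> bool:
    """Flag a hit as low-similarity when identity falls below the threshold."""
    return identity < threshold


# Toy, hypothetical aligned fragments (not real domain sequences).
query = "MKT-LLVAGG"
reference = "MKSALLIAGA"
identity = percent_identity(query, reference)
low = is_low_similarity(identity)
```

Other identity definitions (for example, counting gap columns in the denominator) would give different values, so any threshold is only meaningful alongside the definition used.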

5.2 KEY FINDINGS

5.2.1 Soil fertility is associated with natural product gene presence and diversity in polar desert soils

Limitation of certain nutrients, namely carbon, nitrogen, phosphorus and iron, is known to play a regulatory role in microbial secondary metabolite biosynthesis in laboratory settings (van der Heul et al. 2018). Associations between low soil carbon and moisture and biosynthetic gene richness have also been identified across diverse soil biomes (Charlop-Powers et al. 2014), with arid soils predicted to be promising targets for NP bioprospecting owing to their high abundance of Actinomycetales (Charlop-Powers et al. 2014). In Chapter 2, correlations were indeed observed between biosynthetic genes and soil fertility factors (Fig. 2.12, Table 2.3), with significant negative associations observed between soil carbon, nitrogen and moisture and both the detection (P < 0.001) and richness (P < 0.05) of the targeted biosynthetic domains (Section 2.3.6). In polar soils, no trend was observed between natural product domain richness and the relative abundance of Actinobacteria or Actinomycetales, but significant correlations (P = 0.001) were found between total bacterial diversity and NP gene diversity (Section 2.3.6). We found that NP gene communities displayed closest similarities at the regional and local level (Fig. 2.13), with relatedness patterns highly similar to those of both phylogenetic diversity and soil environmental parameters (Fig. 2.14). This indicates that the microbial communities at these sites are significantly influenced by abiotic soil conditions (Dumbrell et al. 2010, Ferrari et al. 2015).
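To make the notion of a negative association between a soil fertility factor and domain richness concrete, the following minimal sketch computes a Spearman rank correlation in plain Python. The choice of Spearman's rho is an assumption made for illustration; the statistics actually applied are those described in Section 2.3.6. The soil carbon and richness values are invented placeholders, not data from this study.

```python
def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical placeholder values: richness falls as soil carbon rises.
soil_carbon = [0.1, 0.3, 0.5, 0.9, 1.4, 2.0]
domain_richness = [42, 40, 31, 25, 12, 8]
rho = spearman_rho(soil_carbon, domain_richness)
```

A rho close to -1 indicates a strongly monotonic negative relationship; assessing significance (the P values quoted above) would require an additional test that this sketch does not implement.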

The increased presence and diversity of BGCs in more nutrient-limited polar soils is intriguing, and supports the ecological relevance of secondary metabolism in terms of survival and competition between microbiota for scarce resources (de Pascale et al. 2012). BGCs commonly span long genetic regions, and biosynthesis of their encoded compounds comes at a high metabolic cost (Pickens et al. 2011, Bruns et al. 2018, Fischbach et al. 2008). For the microbes that carry them, the consequences of BGC maintenance include increased genome length and thus replication burden (Bruns et al. 2018). Strong selection pressure towards metabolic efficiency is known to lead to expulsion of functionally superfluous DNA (Bruns et al. 2018, Ofria et al. 2003, Lynch 2006). The retention of BGCs by the Antarctic bacteria studied here therefore suggests that these clusters, even if silent under laboratory conditions, remain functional in environmental settings (Bruns et al. 2018).

5.2.2 Bacterial adaptation to the Antarctic environment includes desiccation, starvation and radiation resistance

In Chapter 3, culturing resulted in the recovery of 34 Actinobacteria, 11 Proteobacteria, and one isolate each from the phyla Bacteroidetes and Firmicutes (Tables 3.4 & 3.5). These four phyla typically dominate environmental culturing efforts, including those from Antarctic soils (Smith et al. 2006, Cary et al. 2010, Zdanowski et al. 2013, Pudasaini et al. 2017, van Dorst et al. 2016). In molecular surveys too, the polar desert sites contained an abundance of Actinobacteria (av. 17-43%) and Proteobacteria (av. 9-42%) (Fig. 2.9). While ubiquitous in all soils, Actinobacteria and Proteobacteria have adapted well to both hot and cold desert systems (Battistuzzi & Hedges 2009, Makhalanyane et al. 2015a, Cary et al. 2010). For Actinobacteria, this is generally attributed to an increased tolerance to desiccation and starvation (Delgado-Baquerizo et al. 2018), provided by the Gram-positive cell wall, in addition to strategies that exploit dormancy during unfavourable conditions, including the development of spores and, for non-spore formers, cyst-like resting cells (Soina et al. 2004). These dormant forms are highly resistant to environmental challenges, and resource limitation has been shown to regulate dormancy in natural microbial populations (Lennon & Jones 2011, Battistuzzi & Hedges 2009, Makhalanyane et al. 2015a). Atmospheric trace gas scavenging has been implicated as an important survival strategy during dormancy (Greening et al. 2015), including for Actinobacteria in Antarctic soils (Ji et al. 2017). For the Proteobacteria, of which the Gamma and Alpha groups typically dominate in deserts (Makhalanyane et al. 2015a), aerobic anoxygenic phototrophy (AAP) may be an important survival strategy for certain genera (Section 3.4) (Makhalanyane et al. 2015a, Tahon & Willems 2017). Indeed, several known AAP taxa were recovered in Chapter 3, from the genera Sphingomonas and Methylobacterium (Table 3.4).

A high proportion (~80%) of isolates cultivated in Chapter 3 (Tables 3.4 & 3.5) are members of genera shown to have high desiccation and radiation tolerance: Geodermatophilus, Rhodococcus, Arthrobacter, Sphingomonas, Methylobacterium, Hymenobacter, Streptomyces, Microbacterium, Micrococcus and Planococcus (Narvaez-Reinaldo et al. 2010, McBride et al. 2014, Marizcurrena et al. 2019, Barnard et al. 2013, Rainey et al. 2005). Co-occurrence of desiccation and radiation resistance is common, as both stressors result in accumulation of free radicals, leading to analogous damage to DNA. Radiation resistance is therefore believed to be a secondary consequence of adaptation to desiccation (Musilova et al. 2015). The prevalence of the above genera in culturing studies from Antarctic soils (Nicetic 2016, Pudasaini et al. 2017, Peeters et al. 2011, Tahon & Willems 2017) suggests endemism, and supports phylogenetic surveys suggesting that desiccation- and radiation-resistant groups show a higher prevalence in Antarctica (Cowan et al. 2014, Musilova et al. 2015). However, it also demonstrates their versatility for adaptation to artificial cultivation, and to more copiotrophic growth conditions (Fierer et al. 2007), compared with more fastidious taxa such as the rarely cultured phylum Chloroflexi (Hanada 2014), which remained uncultured in our studies despite high abundance at some sites (Fig. 2.9).

In Chapter 3, lengthy incubation times (> 100 days) (Tables 3.4 & 3.5) resulted in the recovery of Geodermatophilus, Mesorhizobium and Hymenobacter isolates predicted to be novel at species level (97-98% identity), as well as the rarely cultured Actinomycetales genera Frigoribacterium and Janibacter (Tiwari & Gupta 2013). The novel culturing approaches (DSC and SSMS) used here were successful in capturing a diversity of Actinobacteria (Tables 3.4 & 3.5), particularly Streptomyces spp. (18 species), most of which were recovered by DSC (Table 3.4). Results here suggest that three main Streptomyces clades are endemic in eastern Antarctic soils. These are:

• The S. lavendulae / S. spororaveus / S. virginiae clade (Fig. 4.2) (Labeda et al. 2017, Cheng et al. 2016), members of which have been isolated from Herring Island (Table 3.4), Adams Flat (Wong 2018), Robinson Ridge (Table 3.3) (Nicetic 2016) and Browning Peninsula soils (Table 3.3) (Pudasaini et al. 2017). Importantly, this clade displayed the greatest measurable antimicrobial activity overall (Tables 3.7 & 3.8), with broad-spectrum activity spanning Gram-positive, Gram-negative and fungal pathogens.

• The S. badius / S. clavifer / S. finlayi / S. griseus / S. parvus / S. pratensis clade (Fig. 4.2) (Labeda et al. 2017, Cheng et al. 2016), members of which have been recovered from Herring Island, Mitchell Peninsula, Rookery Lake (Tables 3.4, 3.5) and Adams Flat (Wong 2018). For this clade, antimicrobial activity was primarily confined to Gram-positive pathogens (Tables 3.7 & 3.8).

• The S. fildesensis / S. beijiangensis clade (Labeda et al. 2017, Cheng et al. 2016), members of which have previously been recovered from Browning Peninsula (Table 3.3) (Pudasaini et al. 2017), Mitchell Peninsula and Robinson Ridge (Nicetic 2016). These species showed little antimicrobial activity against the pathogens tested here (Table 3.8).

5.2.3 Biosynthetic gene clusters in Antarctic bacteria highlight survival strategies

"Everything is everywhere: but the environment selects"
- L. Becking (O'Malley 2007).

Overall, BGCs detected in the Antarctic bacterial genomes emphasise the diverse ecological roles of secondary metabolites, most prominently those related to survival and nutrient acquisition (Appendix Table A3.2). Specifically, putative BGCs encoded compounds involved in osmotic stress reduction (e.g. ectoine) (Vicente et al. 2018), membrane fluidity (e.g. sterols, hopene, carotenoids, PUFAs) (Nett et al. 2009, Seipke & Loria 2009), protection from UV radiation (e.g. carotenoids, melanins) (Walter & Strack 2011, Plonka 2006), aerial mycelia and cyst development (e.g. AmfS, SapB, alkylresorcinol) (Funa et al. 2006), and iron acquisition (e.g. siderophores) (Barona-Gómez et al. 2004) (Section 4.4). While these BGC families are not unique to bacteria from arid polar soils, the secondary metabolites they encode may play a vital role in increasing the fitness of producing species in this hostile environment.

5.2.3.1 Carotenoids, siderophores and biosurfactants

Carotenoids and siderophores are two of the most important families of secondary metabolites produced by microbes, and both are taxonomically and geographically widespread (Cimermancic et al. 2014, Walter & Strack 2011). Carotenoids serve a multitude of functions: they provide photoprotection, assist in maintaining membrane fluidity and act as accessory pigments in phototrophy (Section 3.4). They have also been suggested to occur frequently in cold-adapted bacteria (Peeters et al. 2011, Baraúna et al. 2017, De Maayer et al. 2014, Koblížek & Brussaard 2015, Dieser et al. 2010). Here, 50% of all cultured isolates displayed carotenoid-like pigmentation (Fig. 3.16), and 12 of the 17 genomes contained at least one known carotenoid BGC (Appendix Table A3.2). These included clusters encoding isorenieratene, found in the Streptomyces spp.; astaxanthin dideoxyglycoside, detected in the Sphingomonas and Novosphingobium strains; and sioxanthin, in the Quadrisphaera isolate. Carotenoids are produced through terpene pathways (Walter & Strack 2011, Eisenreich et al. 2004), and terpenes were the largest group of BGCs found in the Antarctic bacterial genomes (Fig. 4.6). Furthermore, seven terpene BGCs remained uncharacterised, indicating potential for novel compounds (Appendix Table A3.2).

Overall, seven of the seventeen bacterial genomes carried at least one siderophore BGC (Appendix Table A3.2). Streptomyces are known, without exception, to harbour siderophore BGCs, such as that for desferrioxamine (van der Heul et al. 2018). This was also found here, with all three Antarctic Streptomyces genomes harbouring a highly similar cluster (Appendix Table A3.2). Additionally, NRPS analysis of the amplicon sequences in Chapter 2 revealed four siderophore pathways (Appendix Table A1.2). Little is currently known regarding siderophore production by Antarctic soil bacteria (De Serrano et al. 2016); however, in studies examining hot deserts and other soil microbiomes, siderophores have been implicated in rock weathering (Liermann et al. 2000, Adams et al. 1992, Ahmed & Holmström 2015). Microbes including Streptomyces are known to form attachments to mineral surfaces in the environment, and siderophore production has been found to increase mineral dissolution rates, providing a source of essential metals for uptake by the microbial community (Ahmed & Holmström 2015, Liermann et al. 2000, Choe et al. 2018). In Chapter 3, microbial attachment to mineral surfaces was observed during DSC methods (Figs. 3.9, 3.10). It has previously been proposed that siderophore biosynthesis provides a competitive edge in iron-limited soils (Galet et al. 2015), and that other taxa, such as Pseudomonas, have evolved the capability to pirate siderophores produced by others (Galet et al. 2015).

Of interest, a link has been reported between the secretion of biosurfactants and siderophore-mediated nutrient acquisition during mineral weathering, whereby microbial communities attach to mineral surfaces via biofilm formation (Ahmed & Holmström 2015), in a process known to involve both biosurfactants and siderophores (Paraszkiewicz et al. 2017, Yang et al. 2012). In NRPS domain analyses in both Chapters 2 and 4, a high proportion of pathways showed similarity to those encoding biosurfactant peptides such as syringomycin and gramicidin (Fig. 2.8, Appendix Table A3.3). Biosurfactant metabolites exhibit highly versatile ecological roles (Section 2.4), and are common in cold-adapted microorganisms (Perfumo et al. 2018). Metal harvesting strategies from mineral surfaces may be of increased importance in the extremely resource-limited Antarctic environment, leading to an abundance of biosurfactant-like molecule pathways. Biosurfactant production in these isolates remains to be determined, for which a number of rapid screening methods could be employed (Sarwar et al. 2018). To further investigate the role of biosurfactants in mineral weathering, bacterial strains could be examined for their ability to mobilise metals from crushed mineral into solution (Becerra-Castro et al. 2013), and re-tested following disruption of biosurfactant synthesis genes.

5.2.3.2 Long-chain polyunsaturated fatty acids

In bacteria, LC-PUFAs have been found almost exclusively in Gram-negative psychrophilic marine Gammaproteobacteria, such as Shewanella and Colwellia spp. (Shulse & Allen 2011, Bianchi et al. 2014). Here, isolate Azospirillum INR13 was found to harbour a BGC highly similar to the LC-PUFA synthase of another terrestrial bacterium, Aetherobacter (Fig. 4.14) (Gemperlein et al. 2014). This is an exciting prospect, as LC-PUFAs have not previously been reported from Azospirillum. These unusual secondary lipids are synthesised through Type I PKS-like systems, and are primarily sourced from microalgae, cold climate fish and invertebrates. They have substantial biotechnological value as nutritional supplements because many organisms must obtain PUFAs through diet (Bianchi et al. 2014, Nichols et al. 2010b, Sprague et al. 2016). Omega-3 PUFAs, such as EPA and DHA (Fig. 4.14), are given to farmed salmon as a feed additive, and the enormous global demand for salmon has led to a shortage of lipid supply. Consequently, a halving of EPA and DHA levels in farmed fish over the course of a decade has been reported (Sprague et al. 2016). Interestingly, genomic analyses of diverse bacterial lineages have found a range of PUFA-like genes widespread amongst bacteria not known to produce PUFAs, suggesting that HGT events may have contributed to their dissemination (Shulse & Allen 2011). These include some Actinobacterial genera: Rhodococcus, Frankia and Streptomyces. Here too, PUFA-like domains were identified in Streptomyces INR7 (Appendix Table A3.4), situated within uncharacterised PKS BGCs.

5.2.4 Antarctic soil bacteria contain an abundance of uncharacterised biosynthetic domains

The high level of novelty in gene sequences and clusters revealed in Antarctic bacteria here indicates that eastern Antarctic desert soils are exciting targets for novel NP bioprospecting. Overall, ~89% of all PKS and NRPS domain sequences from Chapters 2 and 4 were novel (Sections 2.3.7 & 4.3.7). While this prevented final compound predictions, the inferred functional subtypes predominantly encoded polyene macrolides (e.g. nystatin), macrocyclic lactones (e.g. avermectin, epothilone), lipopeptides (e.g. syringomycin) and enediynes (e.g. C-1027) (Figs. 2.7, 2.8, 4.15, 4.16) (Ziemert et al. 2012, Rausch et al. 2007). Prediction of biosynthetic pathway end-products is facilitated by a number of available bioinformatics tools, such as antiSMASH and NaPDoS (Blin et al. 2017b, Ziemert et al. 2012). For the well-studied mega-synthases such as PKS and NRPS, the task is aided by cluster characteristics including high conservation, modularity, and the tendency of domains to cluster phylogenetically by functional subtype (Rausch et al. 2007, Ziemert et al. 2012, Roongsawang et al. 2011, Medema et al. 2014). The NaPDoS database, while not all-encompassing, is curated to contain representatives from all major classes of Type I and II PKS KS domains and NRPS C domains (Ziemert et al. 2012).

5.2.5 Eukaryotic cells are targeted by many of the predicted biosynthetic pathways

In the eastern Antarctic desert soils analysed here, prokaryotes vastly outnumber eukaryotes, with fungal diversity in particular being surprisingly low (Section 1.7) (Zhang et al. 2019, Ji et al. 2017, Ferrari et al. 2015). Intriguingly, in both amplicon sequencing and BGC analysis (Chapters 2 and 4), the majority of predicted biosynthetic pathways encoded chemical classes with activity against eukaryotic cells, namely antifungals (candicidin, antimycin, nystatin), antitumour compounds (tambromycin, C-1027, epothilone, bleomycin, actinomycin), antiparasitics (avermectin, cyclomarin) and biosurfactants (syringomycin, iturin, SapB, surfactin) (Figs. 2.7, 2.8, 4.15, 4.16, Appendix Table A3.2) (Ziemert et al. 2012, Rausch et al. 2007). In Chapter 2, these pathways comprised 63% of the domain sequences matching known BGCs (Appendix Tables A1.1 & A1.2), and in Chapter 4, 64% of BGCs with similarity to known clusters (Appendix Table A3.2). Furthermore, in phylogenetic analysis of C and KS domains in Chapter 4, 84% of domains aligned with eukaryotic-acting compound pathways (Appendix Tables A3.3-A3.6).

In bioactivity assays, antifungal activity against Candida albicans was confirmed for Streptomyces isolate NBH77, which was predicted to harbour the candicidin BGC (Fig. 4.10), and for Streptomyces INR7 (Fig. 4.7), which carried a BGC similar to that for streptothricin, derivatives of which, such as streptothricin E, have demonstrated activity against Candida (Gan et al. 2011). Further work is required to confirm the isolates' biosynthesis of these compounds.

Interactions between resident microbiota and environmental factors are unquestionably complex. Fungi, despite showing high tolerance to low temperatures and dryness (Sun et al. 2017), are less successful in arid polar deserts (Zhang et al. 2019, Ji et al. 2017, Ferrari et al. 2015). This has primarily been attributed to differences in carbon cycling between the kingdoms, where fungi are suggested to be the dominant decomposers of organic litter and more commonly form symbiotic relationships with plants than bacteria do (Sun et al. 2017, Bahram et al. 2018). As plants are virtually non-existent in eastern Antarctica, and organic carbon is extremely low, edaphic factors may explain the fungal minority. Why, then, would bacteria need to harbour such a capacity for fungi-targeted chemical warfare? The results of this research suggest that the production of a variety of eukaryotic-acting secondary metabolites may contribute to their abilities to out-compete fungi and other micro-eukarya present in these soils. Investigation of this hypothesis could begin with bioactivity assays determining the susceptibility of indigenous fungi to compounds produced by these bacterial isolates. Further, soil microcosms could be used to assess the effects of competition within the Antarctic soil communities, via complete removal of bacteria and measurement of changes to eukaryotic abundance and diversity (Hicks et al. 2019).

5.2.6 Long-read sequencing for natural product domain amplicon and genomic BGC analysis

Long-read sequencing technology was employed here for the first time to survey soil microbiomes for NP domain tag-sequences (Chapter 2). The approach successfully captured full-length amplicons of PKS and NRPS domain fragments targeted with the primer sets employed here (~700 and ~1200 bp). These lengths are currently unachievable with Illumina MiSeq (Goodwin et al. 2016). Amplicon sequencing has previously been employed to survey natural product genes across diverse microbiomes using SGS approaches (Woodhouse et al. 2013, Charlop-Powers et al. 2014, Charlop-Powers et al. 2016, Katz et al. 2016, Lemetre et al. 2017). As with all techniques, amplicon sequencing is not without bias (Krehenwinkel et al. 2017). Minimisation strategies can be incorporated into study design, and here we included the use of degenerate primers and high-fidelity polymerase (Stasik et al. 2018, Krehenwinkel et al. 2017). Importantly, the phylogenetic and functional inferences made through amplicon analysis in Chapter 2 were validated by genomic BGC domain analysis in Chapter 4. Specifically, similar levels of novelty were revealed, as was a similar abundance of particular functional subtypes.

PCR-independent long-read metagenomic approaches are becoming increasingly accessible, and continual improvements to the quality of ultra-long reads produced by platforms such as ONT (> 1 Mb) will undoubtedly lead to major advances in microbial diversity analyses in the future. We conclude that our sequencing approach was an advance for the analysis of large gene fragments such as PKS and NRPS, enabling direct comparison of nucleotide and translated protein sequences (Payne et al. 2018).

5.3 FUTURE DIRECTIONS

Antarctic bacteria harbour a wealth of poorly characterised biosynthetic pathways, with potential for production of medically and industrially valuable novel compounds related to antifungal macrolides, antitumour agents, biosurfactant peptides, siderophores, carotenoids and PUFAs. Multiple paths of further investigation can stem from these findings.

The first priority could be given to the proverbial 'lowest hanging fruit', such as the putative PUFA synthase in Azospirillum. Here, fatty acid profiling could be performed by GC–MS analysis of the fatty acid methyl esters (FAMEs) to confirm production of EPA and DHA (Gemperlein et al. 2014). If demonstrated, this isolate would represent the first known fast-growing, sustainable bacterial source of these in-demand lipids.

For bioactive NPs with medical value, four Actinomycetales isolates showed particular promise for future investigation: Streptomyces spp. INR7, NBH77 and NBSH44, and Kribbella SPB151, all of which were revealed to harbour uncharacterised or low-similarity Type I PKS- and NRPS-containing BGCs (Appendix Table A3.2). The Kribbella and Streptomyces NBSH44 isolates contained a greater proportion of uncharacterised and low gene similarity clusters than the other Streptomyces isolates, INR7 and NBH77 (Section 4.3.6.6); therefore these isolates should be prioritised for further work. Additionally, the Gram-negative antibacterial activity displayed by Streptomyces isolate INR7 is important with regard to the global antimicrobial resistance crisis (WHO 2014), and encourages further activity assays incorporating resistant strains. Here, the next steps would be fermentation, solvent extraction and compound characterisation using LC-MS and NMR, followed by activity assays using both crude and purified extracts (Ling et al. 2015). Notably, this research has revealed gene similarities to > 20 different antitumour-encoding clusters in bacteria from Antarctica (Appendix Tables A1.1, A1.2, A3.2). An exciting opportunity for further work therefore lies in bioassays against various tumour cell lines, to inform on the cytotoxic capabilities of the isolates (Olano et al. 2009). Further, because many BGCs may remain silent under laboratory conditions, and cryptic pathways may prove the most worthy targets for novel bioactive compound production, heterologous expression may be an attractive approach for awakening and targeting specific uncharacterised BGCs (Nah et al. 2017).
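The prioritisation logic described above, favouring isolates with a greater proportion of uncharacterised or low-similarity clusters, can be sketched as follows. This is a minimal illustration only: the record layout, the function names and the use of a single 70% cut-off (borrowed from the similarity threshold quoted earlier in this chapter) are assumptions, and the isolate labels and similarity values are placeholders rather than results from this thesis.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

# Each record is (isolate label, % similarity to the closest known BGC);
# None stands for a cluster with no match to any known BGC.
ClusterRecord = Tuple[str, Optional[float]]


def novel_fraction(clusters: Iterable[ClusterRecord],
                   threshold: float = 70.0) -> Dict[str, float]:
    """Fraction of each isolate's BGCs that are unmatched or below threshold."""
    totals: Dict[str, int] = defaultdict(int)
    novel: Dict[str, int] = defaultdict(int)
    for isolate, similarity in clusters:
        totals[isolate] += 1
        if similarity is None or similarity < threshold:
            novel[isolate] += 1
    return {isolate: novel[isolate] / totals[isolate] for isolate in totals}


def rank_isolates(clusters: Iterable[ClusterRecord],
                  threshold: float = 70.0) -> List[Tuple[str, float]]:
    """Order isolates by descending novel fraction, breaking ties by name."""
    fractions = novel_fraction(clusters, threshold)
    return sorted(fractions.items(), key=lambda kv: (-kv[1], kv[0]))


# Placeholder records for two hypothetical isolates.
example = [
    ("isolate_A", None), ("isolate_A", 45.0), ("isolate_A", 99.0),
    ("isolate_B", 98.0), ("isolate_B", 100.0), ("isolate_B", 62.0),
    ("isolate_B", 95.0),
]
ranking = rank_isolates(example)
```

In practice such a ranking would be only one input to prioritisation, alongside observed bioactivity and the cluster types involved.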

Overall, the findings of this research indicate that the desert soils of eastern Antarctica are indeed excellent candidates for novel natural product bioprospecting, and offer insight into the functional and ecological relevance of secondary metabolites with regard to both competition between microbiota and strategies for survival in soils where resources are highly limited.


REFERENCES

Abraham, E. P. & Chain, E. (1940). An Enzyme from Bacteria able to Destroy Penicillin. Nature. 146:837-837.

Adamek, M., Alanjary, M., Sales-Ortells, H., Goodfellow, M., Bull, A. T., Winkler, A., Wibberg, D., Kalinowski, J. & Ziemert, N. (2018). Comparative genomics reveals phylogenetic distribution patterns of secondary metabolites in Amycolatopsis species. BMC genomics. 19(1):426.

Adams, J. B., Palmer, F. & Staley, J. T. (1992). Rock weathering in deserts: Mobilization and concentration of ferric iron by microorganisms. Geomicrobiology Journal. 10(2):99-114.

Ageitos, J. M., Sánchez-Pérez, A., Calo-Mata, P. & Villa, T. G. (2017). Antimicrobial peptides (AMPs): Ancient compounds that represent novel weapons in the fight against bacteria. Biochemical Pharmacology. 133:117-138.

Ahmed, E. & Holmström, S. J. M. (2015). Microbe–mineral interactions: The impact of surface attachment on mineral weathering and element selectivity by microorganisms. Chemical Geology. 403:13-23.

Aislabie, J. M., Balks, M. R., Foght, J. M. & Waterhouse, E. J. (2004). Hydrocarbon Spills on Antarctic Soils: Effects and Management. Environmental Science & Technology. 38(5):1265-1274.

Aislabie, J. M., Jordan, S. & Barker, G. M. (2008). Relation between soil classification and bacterial diversity in soils of the Ross Sea region, Antarctica. Geoderma. 144(1-2):9-20.

Alain, K. & Querellou, J. (2009). Cultivating the uncultured: limits, advances and future challenges. Microbial Life Under Extreme Conditions. 13(4):583-594.

Aleti, G., Nikolić, B., Brader, G., Pandey, R. V., Antonielli, L., Pfeiffer, S., Oswald, A. & Sessitsch, A. (2017). Secondary metabolite genes encoded by potato rhizosphere microbiomes in the Andean highlands are diverse and vary with sampling site and vegetation stage. Scientific Reports. 7(1):2330.

Aliyu, H., De Maayer, P., Cowan, D. & Wagner, D. (2016). The genome of the Antarctic polyextremophile Nesterenkonia sp. AN1 reveals adaptive strategies for survival under multiple stress conditions. FEMS Microbiology Ecology. 92(4):fiw032.

Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. (1990). Basic local alignment search tool. Journal of molecular biology. 215(3):403.

Amann, R., Ludwig, W. & Schleifer, K. (1995). Phylogenetic identification and in-situ detection of individual microbial-cells without cultivation. Microbiological Reviews. 59:143-169.

Aminov, R. I. & Mackie, R. I. (2007). Evolution and ecology of antibiotic resistance genes. FEMS Microbiology Letters. 271:147-161.

Andreoni, V., Cavalca, L., Rao, M. A., Nocerino, G., Bernasconi, S., Dell’Amico, E., Colombo, M. & Gianfreda, L. (2004). Bacterial communities and enzyme activities of PAHs polluted soils. Chemosphere. 57(5):401-412.

Andrews, S. (2010). FastQC: a quality control tool for high throughput sequence data [Online]. Available: http://www.bioinformatics.babraham.ac.uk/projects/fastqc [Accessed May 2016].

Aparicio, J. F. (2003). Polyene antibiotic biosynthesis gene clusters. Applied Microbiology and Biotechnology. 61(3):179-188.

Aparicio, J. F., Barreales, E. G., Payero, T. D., Vicente, C. M., de Pedro, A. & Santos-Aberturas, J. (2016). Biotechnological production and application of the antibiotic pimaricin: biosynthesis and its regulation. Applied Microbiology and Biotechnology. 100(1):61-78.

Arendrup, M. C. & Patterson, T. F. (2017). Multidrug-Resistant Candida: Epidemiology, Molecular Mechanisms, and Treatment. The Journal of Infectious Diseases. 216(suppl_3):S445-S451.

Austin, M. B. & Noel, J. P. (2003). The chalcone synthase superfamily of type III polyketide synthases. Natural Product Reports. 20(1):79-110.

Australian Antarctic Data Centre (AADC). (2017). Map Catalogue [Online]. Available: https://data.aad.gov.au/aadc/mapcat/search_mapcat.cfm [Accessed March 2017].

Australian Antarctic Data Centre (AADC). (2018). Gazetteer [Online]. Available: https://data.aad.gov.au/aadc/gaz/ [Accessed Sept 2018].

Australian Antarctic Division (AAD). (2002). About Antarctica [Online]. Available: http://www.antarctica.gov.au/about-antarctica [Accessed April 2016].

Ayuso-Sacido, A. & Genilloud, O. (2005). New PCR primers for the screening of NRPS and PKS-I systems in actinomycetes: Detection and distribution of these biosynthetic gene sequences in major taxonomic groups. Microbial Ecology. 49(1):10-24.

Babalola, O. O., Kirby, B. M., Le Roes-Hill, M., Cook, A. E., Cary, S. C., Burton, S. G. & Cowan, D. A. (2009). Phylogenetic analysis of actinobacterial populations associated with Antarctic Dry Valley mineral soils. Environmental Microbiology. 11(3):566-576.

Bahram, M., Hildebrand, F., Forslund, S. K., Anderson, J. L., Soudzilovskaia, N. A., Bodegom, P. M., Bengtsson-Palme, J., Anslan, S., Coelho, L. P., Harend, H., Huerta-Cepas, J., Medema, M. H., Maltz, M. R., Mundra, S., Olsson, P. A., Pent, M., Põlme, S., Sunagawa, S., Ryberg, M., Tedersoo, L. & Bork, P. (2018). Structure and function of the global topsoil microbiome. Nature. 560(7717):233.

Bailey, B. T., Morgan, P. J. & Lackie, M. A. (2016). An assessment of the gravity signature of the Windmill Islands, East Antarctica. Antarctic science. 28(2):115-126.

Baker, K., Dallman, T., Field, N., Childs, T., Mitchell, H., Day, M., Weill, F.-X., Lefèvre, S., Tourdjman, M., Hughes, G., Jenkins, C. & Thomson, N. (2018). Horizontal antimicrobial resistance transfer drives epidemics of multiple Shigella species. Nature Communications. 9(1):1-10.

Baldim, J. L., da Silva, B. L., Chagas-Paula, D. A., Lago, J. H. G. & Soares, M. G. (2017). A strategy for the identification of patterns in the biosynthesis of nonribosomal peptides by Betaproteobacteria species. Scientific Reports. 7(1):1-11.

Balks, M. R., Paetzold, R. F., Kimble, J. M., Aislabie, J. & Campbell, I. B. (2002). Effects of hydrocarbon spills on the temperature and moisture regimes of Cryosols in the Ross Sea region. Antarctic science. 14(4):319-326.

Baltz, R. (2017). Gifted microbes for genome mining and natural product discovery. Official Journal of the Society for Industrial Microbiology and Biotechnology. 44(4):573-588.

Baltz, R. H. (2007). Antimicrobials from actinomycetes: Back to the future. Microbe. 2(3):125-131.

Banik, J. & Brady, S. (2010). Recent application of metagenomic approaches toward the discovery of antimicrobials and other bioactive small molecules. Current Opinion in Microbiology. 13:603-609.

Baraúna, R., Freitas, D., Pinheiro, J., Folador, A. & Silva, A. (2017). A Proteomic Perspective on the Bacterial Adaptation to Cold: Integrating OMICs Data of the Psychrotrophic Bacterium Exiguobacterium antarcticum B7. Proteomes. 5(1):9.

Barka, E., Vatsa, P., Sanchez, L., Gaveau-Vaillant, N., Jacquard, C., Klenk, H.-P., Clément, C., Ouhdouch, Y. & van Wezel, G. (2016). Taxonomy, Physiology, and Natural Products of Actinobacteria. Microbiology and Molecular Biology Reviews. 80(1):1.

Barnard, R. L., Osborne, C. A. & Firestone, M. K. (2013). Responses of soil bacterial and fungal communities to extreme desiccation and rewetting. The ISME Journal. 7(11):2229-2241.

Barona-Gómez, F., Wong, U., Giannakopulos, A. E., Derrick, P. J. & Challis, G. L. (2004). Identification of a cluster of genes that directs desferrioxamine biosynthesis in Streptomyces coelicolor M145. Journal of the American Chemical Society. 126(50):16282-16283.

Barry, R. G. & Hall-McKim, E. A. (2018). Polar Environments and Global Change, Cambridge University Press.

Bates, D., Mächler, M., Bolker, B. M. & Walker, S. C. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software. 67(1):1-48.

Battistuzzi, F. U. & Hedges, S. B. (2009). A Major Clade of Prokaryotes with Ancient Adaptations to Life on Land. Molecular Biology and Evolution. 26(2):335-343.

Becerra-Castro, C., Kidd, P., Kuffner, M., Prieto-Fernández, Á., Hann, S., Monterroso, C., Sessitsch, A., Wenzel, W. & Puschenreiter, M. (2013). Bacterially Induced Weathering of Ultramafic Rock and Its Implications for Phytoextraction. Applied and Environmental Microbiology. 79(17):5094-5103.

Benson, D. A., Karsch-Mizrachi, I., Lipman, D. J., Ostell, J. & Sayers, E. W. (2011). GenBank. Nucleic acids research. 39:D32-37.

Bérdy, J. (2005). Bioactive microbial metabolites: A personal view. Journal of Antibiotics. 58(1):1-26.

Beveridge, T. (2001). Use of the Gram stain in microbiology. Biotechnic & Histochemistry. 76(3):111-118.

Bianchi, A., Olazábal, L., Torre, A. & Loperena, L. (2014). Antarctic microorganisms as source of the omega-3 polyunsaturated fatty acids. World Journal of Microbiology and Biotechnology. 30(6):1869-78.

Bissett, A., Fitzgerald, A., Meintjes, T., Mele, P. M., Reith, F., Dennis, P. G., Breed, M. F., Brown, B., Brown, M. V., Brugger, J., Byrne, M., Caddy-Retalic, S., Carmody, B., Coates, D. J., Correa, C., Ferrari, B. C., Gupta, V. V. S. R., Hamonts, K., Haslem, A., Hugenholtz, P., Karan, M., Koval, J., Lowe, A. J., Macdonald, S., McGrath, L., Martin, D., Morgan, M., North, K. I., Paungfoo-Lonhienne, C., Pendall, E., Phillips, L., Pirzl, R., Powell, J. R., Ragan, M. A., Schmidt, S., Seymour, N., Snape, I., Stephen, J. R., Stevens, M., Tinning, M., Williams, K., Yeoh, Y. K., Zammit, C. M. & Young, A. (2016). Introducing BASE: The Biomes of Australian Soil Environments soil microbial diversity database. GigaScience. 5(1):1-11.

Blagodatskaya, E. & Kuzyakov, Y. (2013). Active microorganisms in soil: Critical review of estimation criteria and approaches. Soil Biol. Biochem. 67:192-211.

Blanc, G., Agarkova, I., Grimwood, J., Kuo, A., Brueggeman, A., Dunigan, D. D., Gurnon, J., Ladunga, I., Lindquist, E., Lucas, S., Pangilinan, J., Pröschold, T., Salamov, A., Schmutz, J., Weeks, D., Yamada, T., Lomsadze, A., Borodovsky, M., Claverie, J.-M., Grigoriev, I. V. & Van Etten, J. L. (2012). The genome of the polar eukaryotic microalga Coccomyxa subellipsoidea reveals traits of cold adaptation. Genome biology. 13(5):R39-R39.

Blin, K., Kim, H. U., Medema, M. H. & Weber, T. (2017a). Recent development of antiSMASH and other computational approaches to mine secondary metabolite biosynthetic gene clusters. Briefings in Bioinformatics. bbx146, https://doi.org/10.1093/bib/bbx146.

Blin, K., Shaw, S., Steinke, K., Villebro, R., Ziemert, N., Lee, S. Y., Medema, M. H. & Weber, T. (2019). antiSMASH 5.0: updates to the secondary metabolite genome mining pipeline. Nucleic acids research. gkz310, https://doi.org/10.1093/nar/gkz310.

Blin, K., Wolf, T., Chevrette, M. G., Lu, X., Schwalen, C. J., Kautsar, S. A., Suarez Duran, H. G., de los Santos, E. L. C., Kim, H. U., Nave, M., Dickschat, J. S., Mitchell, D. A., Shelest, E., Breitling, R., Takano, E., Lee, S. Y., Weber, T. & Medema, M. H. (2017b). antiSMASH 4.0-improvements in chemistry prediction and gene cluster boundary identification. Nucleic acids research. 45(W1):W36-W41.

Borchert, E., Jackson, S. A., O’Gara, F. & Dobson, A. D. W. (2016). Diversity of Natural Product Biosynthetic Genes in the Microbiome of the Deep Sea Sponges Inflatella pellicula, Poecillastra compressa, and Stelletta normani. Frontiers in Microbiology. 7(1027).

Borsetto, C., Amos, G. C. A., da Rocha, U. N., Mitchell, A. L., Finn, R. D., Laidi, R. F., Vallin, C., Pearce, D. A., Newsham, K. K. & Wellington, E. M. H. (2019). Microbial community drivers of PK/NRP gene diversity in selected global soils. Microbiome. 7(1).

Bowman, J. P., Abell, G. C. J. & Nichols, C. A. M. (2005). Psychrophilic Extremophiles from Antarctica: Biodiversity and Biotechnological Potential. Ocean and Polar Research. 27(2):221-230.

Bowman, J. P., McCammon, S. A., Brown, J. L. & McMeekin, T. A. (1998). Glaciecola punicea gen. nov., sp. nov. and Glaciecola pallidula gen. nov., sp. nov.: psychrophilic bacteria from Antarctic sea-ice habitats. International Journal of Systematic Bacteriology. 48(4):1213-1222.

Brockman, E. R. & Boyd, W. L. (1963). Myxobacteria from soils of the Alaskan and Canadian Arctic. The Journal of Bacteriology. 86(3):605.

Brooijmans, R. J. W., Pastink, M. I. & Siezen, R. J. (2009). Hydrocarbon-degrading bacteria: the oil-spill clean-up crew. Microbial Biotechnology. 2(6):587-594.

Bruns, H., Crusemann, M., Letzel, A., Alanjary, M., McInerney, J., Jensen, P., Schulz, S., Moore, B. S. & Ziemert, N. (2018). Function-related replacement of bacterial siderophore pathways. The ISME Journal. 12(2):320-329.

Burg, R. W. (1979). Avermectins, new family of potent anthelmintic agents: Producing organism and fermentation. Antimicrobial agents and chemotherapy. 15(3):361-367.

Burja, A., Banaigs, B., Abou-Mansour, E., Burgess, J. & Wright, P. C. (2001). Marine cyanobacteria - a prolific source of natural products. Tetrahedron. 57:9347-9377.

Busti, E. (2006). Antibiotic-producing ability by representatives of a newly discovered lineage of actinomycetes. Microbiology. 152(3):675-683.

Butler, M. S., Blaskovich, M. A. & Cooper, M. A. (2013). Antibiotics in the clinical pipeline in 2013. The Journal of Antibiotics. 66(10):571-591.

Caffrey, P., De Poire, E., Sheehan, J. & Sweeney, P. (2016). Polyene macrolide biosynthesis in streptomycetes and related bacteria: recent advances from genome sequencing and experimental studies. Applied Microbiology and Biotechnology. 100(9):3893-3908.

Calteau, A., Fewer, D. P., Latifi, A., Coursin, T., Laurent, T., Jokela, J., Kerfeld, C. A., Sivonen, K., Piel, J. & Gugger, M. (2014). Phylum-wide comparative genomics unravel the diversity of secondary metabolism in Cyanobacteria. BMC genomics. 15(1):977.

Camacho, C., Coulouris, G., Avagyan, V., Ma, N., Papadopoulos, J., Bealer, K. & Madden, T. (2009). BLAST plus: architecture and applications. BMC Bioinformatics. 10(1):421.

Campbell, I. B. & Claridge, G. G. C. (1987a). The Biology of Antarctic soils. Developments in Soil Science. ISBN 9780444427847: Elsevier.

Campbell, I. B. & Claridge, G. G. C. (1987b). The Climate of Antarctica. Developments in Soil Science. ISBN 9780444427847: Elsevier.

Cane, D. E. (2010). Programming of erythromycin biosynthesis by a modular polyketide synthase. The Journal of Biological Chemistry. 285(36):27517.

Caporaso, J. G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F. D., Costello, E. K., Fierer, N., Peña, A. G., Goodrich, J. K., Gordon, J. I., Huttley, G. A., Kelley, S. T., Knights, D., Koenig, J. E., Ley, R. E., Lozupone, C. A., McDonald, D., Muegge, B. D., Pirrung, M., Reeder, J., Sevinsky, J. R., Turnbaugh, P. J., Walters, W. A., Widmann, J., Yatsunenko, T., Zaneveld, J. & Knight, R. (2010). QIIME allows analysis of high-throughput community sequencing data. Nature Methods. 7(5):335-336.

Caporaso, J. G., Lauber, C. L., Walters, W. A., Berg-Lyons, D., Lozupone, C. A., Turnbaugh, P. J., Fierer, N. & Knight, R. (2011). Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proceedings of the National Academy of Sciences of the United States of America. 108:4516-4522.

Carvajal, F. (1947). Screening Tests for Antibiotics. Mycologia. 39(1):128-130.

Cary, S. C., McDonald, I. R., Barrett, J. E. & Cowan, D. A. (2010). On the rocks: The microbiology of Antarctic Dry Valley soils. Nature Reviews: Microbiology. 8(2):129-138.

Casanueva, A., Tuffin, M., Cary, C. & Cowan, D. A. (2010). Molecular adaptations to psychrophily: the impact of ‘omic’ technologies. Trends in Microbiology. 18(8):374-381.

Castresana, J. (2000). Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Molecular Biology and Evolution. 17(4):540.

Centers for Disease Control and Prevention (CDC). (2013). Antibiotic resistance threats in the United States [Online]. Available: https://www.cdc.gov/drugresistance/threat-report-2013/pdf/ar-threats-2013-508.pdf [Accessed June 2018].

Challis, G. L. (2008). Genome mining for novel natural product discovery. Journal of medicinal chemistry. 51(9):2618.

Charlop-Powers, Z., Owen, J. G., Reddy, B. V. B., Ternei, M. A. & Brady, S. F. (2014). Chemical-biogeographic survey of secondary metabolism in soil. Proceedings of the National Academy of Sciences of the United States of America. 111(10):3757-3762.

Charlop-Powers, Z., Owen, J. G., Reddy, B. V. B., Ternei, M. A., Guimarães, D. O., de Frias, U. A., Pupo, M. T., Seepe, P., Feng, Z. & Brady, S. F. (2015). Global biogeographic sampling of bacterial secondary metabolism. eLife. 4(4):e05048.

Charlop-Powers, Z., Pregitzer, C. C., Lemetre, C., Ternei, M. A., Maniko, J., Hover, B. M., Calle, P. Y., McGuire, K. L., Garbarino, J., Forgione, H. M., Charlop-Powers, S. & Brady, S. F. (2016). Urban park soil microbiomes are a rich reservoir of natural product biosynthetic diversity. Proceedings of the National Academy of Sciences of the United States of America. 113(51):14811-14816.

Chattopadhyay, M. & Jagannadham, M. (2001). Maintenance of membrane fluidity in Antarctic bacteria. Polar Biology. 24(5):386-388.

Chen, H. & Boutros, P. C. (2011). VennDiagram: a package for the generation of highly-customizable Venn and Euler diagrams in R. BMC Bioinformatics. 12(1):35-35.

Chen, Y.-C., Liu, T., Yu, C.-H., Chiang, T.-Y., Hwang, C.-C. & Xu, Y. (2013). Effects of GC Bias in Next-Generation-Sequencing Data on De Novo Genome Assembly. PloS one. 8(4):e62856.

Cheng, K., Rong, X. & Huang, Y. (2016). Widespread interspecies homologous recombination reveals reticulate evolution within the genus Streptomyces. Molecular Phylogenetics and Evolution. 102:246-254.

Cheng, Y.-Q., Tang, G.-L. & Ben, S. (2003). Type I polyketide synthase requiring a discrete acyltransferase for polyketide biosynthesis. Proceedings of the National Academy of Sciences of the United States of America. 100(6):3149-3154.

Chin, C.-S., Alexander, D. H., Marks, P., Klammer, A. A., Drake, J., Heiner, C., Clum, A., Copeland, A., Huddleston, J., Eichler, E. E., Turner, S. W. & Korlach, J. (2013). Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nature Methods. 10(6):563-569.

Chintalapati, S. (2004). Role of membrane lipid fatty acids in cold adaptation. Cellular and Molecular Biology. 50(5):631-642.

Choe, Y.-H., Kim, M., Woo, J., Lee, M. J., Lee, J. I., Lee, E. J. & Lee, Y. K. (2018). Comparing Rock-inhabiting Microbial Communities in Different Rock Types from a High Arctic Polar Desert. FEMS Microbiology Ecology. 94(6):fiy070.

Chong, C.-W., Pearce, D. A. & Convey, P. (2015). Emerging spatial patterns in Antarctic prokaryotes. Frontiers in microbiology. 6(SEP).

Chong, C., Annie Tan, G., Wong, R., Riddle, M. & Tan, I. (2009). DGGE fingerprinting of bacteria in soils from eight ecologically different sites around Casey Station, Antarctica. Polar Biology. 32(6):853-860.

Chong, C. W., Pearce, D. A., Convey, P. & Tan, I. K. P. (2012). The identification of environmental parameters which could influence soil bacterial community composition on the Antarctic Peninsula - a statistical approach. Antarctic science. 24(3):249-258.

Čihák, M., Kameník, Z., Šmídová, K., Bergman, N., Benada, O., Kofroňová, O., Petříčková, K. & Bobek, J. (2017). Secondary Metabolites Produced during the Germination of Streptomyces coelicolor. Frontiers in Microbiology. 8(2495).

Cimermancic, P., Medema, M. H., Claesen, J., Kurita, K., Wieland Brown, L. C., Mavrommatis, K., Pati, A., Godfrey, P. A., Koehrsen, M., Clardy, J., Birren, B. W., Takano, E., Sali, A., Linington, R. G. & Fischbach, M. A. (2014). Insights into Secondary Metabolism from a Global Analysis of Prokaryotic Biosynthetic Gene Clusters. Cell. 158(2):412-421.

Clardy, J., Fischbach, M. A. & Walsh, C. T. (2006). New antibiotics from bacterial natural products. Nature Biotechnology. 24(12):1541-1550.

Clarke, K. R. & Gorley, R. N. (2015). PRIMER v7: User Manual/Tutorial, Plymouth, UK, PRIMER-E.

Couso, I., Vila, M., Vigara, J., Cordero, B. F., Vargas, M. Á., Rodríguez, H. & León, R. (2012). Synthesis of carotenoids and regulation of the carotenoid biosynthesis pathway in response to high light stress in the unicellular microalga Chlamydomonas reinhardtii. European journal of phycology. 47(3):223-232.

Cowan, D. A., Makhalanyane, T. P., Dennis, P. G. & Hopkins, D. W. (2014). Microbial ecology and biogeochemistry of continental Antarctic soils. Frontiers in Microbiology. 5(154).

Crits-Christoph, A., Diamond, S., Butterfield, C. N., Thomas, B. C. & Banfield, J. F. (2018). Novel soil bacteria possess diverse genes for secondary metabolite biosynthesis. Nature. 558(7710):440-444.

Cuadrat, R. R. C., Ionescu, D., Davila, A. & Grossart, H. (2018). Recovering Genomics Clusters of Secondary Metabolites from Lakes Using Genome-Resolved Metagenomics. Front. Microbiol. 9(251).

D’Costa, V. M., King, C. E., Kalan, L., Morar, M., Sung, W. W. L., Schwarz, C., Froese, D., Zazula, G., Calmels, F., Debruyne, R., Golding, G. B., Poinar, H. N. & Wright, G. D. (2011). Antibiotic resistance is ancient. Nature. 477:457-461.

Daniel-Ivad, M., Hameed, N., Tan, S., Dhanjal, R., Socko, D., Pak, P., Gverzdys, T., Elliot, M. A. & Nodwell, J. R. (2017). An Engineered Allele of afsQ1 Facilitates the Discovery and Investigation of Cryptic Natural Products. ACS chemical biology. 12(3):628-634.

Das, A. & Khosla, C. (2009). Biosynthesis of Aromatic Polyketides in Bacteria. Accounts of chemical research. 42:631-639.

Davis, K. E. R., Joseph, S. J. & Janssen, P. H. (2005). Effects of Growth Medium, Inoculum Size, and Incubation Time on Culturability and Isolation of Soil Bacteria. Applied and environmental microbiology. 71(2):826.

Davison, J., Dorival, J., Rabeharindranto, H., Mazon, H., Chagot, B., Gruez, A. & Weissman, K. J. (2014). Insights into the function of trans-acyl transferase polyketide synthases from the SAXS structure of a complete module. Chemical Science. 5(8):3081-3095.

Dawid, W. (2000). Biology and global distribution of myxobacteria in soils. FEMS Microbiology Review. 24:403-427.

Dawid, W., Gallikowski, C. & Hirsch, P. (1988). Psychrophilic myxobacteria from antarctic soils. Polarforschung. 58(2/3):271-278.

Daza, A., Martín, J. F., Dominguez, A. & Gil, J. A. (1989). Sporulation of several species of Streptomyces in submerged cultures after nutritional downshift. Journal of general microbiology. 135(9):2483.

De Maayer, P., Anderson, D., Cary, C. & Cowan, D. A. (2014). Some like it cold: understanding the survival strategies of psychrophiles. EMBO Reports. 15:508-517.

de Pascale, D., de Santi, C., Fu, J. & Landfald, B. (2012). The microbial diversity of Polar environments is a fertile ground for bioprospecting. Marine genomics. 8:15-22.

De Serrano, L. O., Camper, A. K. & Richards, A. M. (2016). An overview of siderophores for iron acquisition in microorganisms living in the extreme. BioMetals. 29(4):551-571.

Delgado-Baquerizo, M., Reith, F., Dennis, P. G., Hamonts, K., Powell, J. R., Young, A., Singh, B. K. & Bissett, A. (2018). Ecological drivers of soil microbial diversity and soil biological networks in the Southern Hemisphere. Ecology. 99(3):583-596.

Dereeper, A., Guignon, V., Blanc, G., Audic, S., Buffet, S., Chevenet, F., Dufayard, J. F., Guindon, S., Lefort, V., Lescot, M., Claverie, J. M. & Gascuel, O. (2008). Phylogeny.fr: robust phylogenetic analysis for the non-specialist. Nucleic acids research. 36:465-469.

Dhakal, D., Pokhrel, A., Shrestha, B. & Sohng, J. (2017). Marine Rare Actinobacteria: Isolation, Characterization, and Strategies for Harnessing Bioactive Compounds. Front. Microbiol. 8(1106).

Dieser, M., Greenwood, M. & Foreman, C. M. (2010). Carotenoid Pigmentation in Antarctic Heterotrophic Bacteria as a Strategy to Withstand Environmental Stresses. Arctic, antarctic, and alpine research. 42(4):396-405.

Donadio, S., Monciardini, P. & Sosio, M. (2007). Polyketide synthases and nonribosomal peptide synthetases: The emerging view from bacterial genomics. Natural Product Reports. 24(5):1073-1079.

Driscoll, C. B., Otten, T. G., Brown, N. M. & Dreher, T. W. (2017). Towards long-read metagenomics: complete assembly of three novel genomes from bacteria dependent on a diazotrophic cyanobacterium in a freshwater lake co-culture. Standards in genomic sciences. 12(1):9.

Dumbrell, A., Nelson, M., Helgason, T., Dytham, C. & Fitter, A. (2010). Relative roles of niche and neutral processes in structuring a soil microbial community. The ISME Journal. 4(3):337-345.

Eddy, S. R. (2011). Accelerated Profile HMM Searches. PLoS Comput. Biol. 7(10):e1002195.

Edgar, R. C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic acids research. 32(5):1792-1797.

Edgar, R. C. (2013). UPARSE: Highly accurate OTU sequences from microbial amplicon reads. Nature Methods. 10(10):996-998.

Edwards, R. J., Moran, N., Devocelle, M., Kiernan, A., Meade, G., Signac, W., Foy, M., Park, S. D. E., Dunne, E., Kenny, D. & Shields, D. C. (2007). Bioinformatic discovery of novel bioactive peptides. Nature chemical biology. 3(2):108-112.

Edwards, R. J. & Palopoli, N. (2015). Computational Prediction of Short Linear Motifs from Protein Sequences. Methods in Molecular Biology. 1268:89-141.

Edwards, R. J., Pérez-Bercoff, Å., Russell, T. L., Attfield, P. V. & Bell, P. J. L. (2018). PacBio sequencing, de novo assembly and haplotype phasing of diploid yeast strains. 7:891. [Poster]. Available from: https://doi.org/10.7490/f1000research.1115667.1 [Accessed Apr 2019].

Eisenreich, W., Bacher, A., Arigoni, D. & Rohdich, F. (2004). Biosynthesis of isoprenoids via the non-mevalonate pathway. Cellular and Molecular Life Sciences. 61(12):1401-1426.

Encheva-Malinova, M., Stoyanova, M., Avramova, H., Pavlova, Y., Gocheva, B., Ivanova, I. & Moncheva, P. (2014). Antibacterial potential of streptomycete strains from Antarctic soils. Biotechnology & Biotechnological Equipment. 28(4):1-7.

England, J., Smith, I. R. & Evans, D. J. A. (2000). The last glaciation of east-central Ellesmere Island, Nunavut: ice dynamics, deglacial chronology, and sea level change. Canadian Journal of Earth Sciences. 37(10):1355-1371.

Ernst, R., Ejsing, C. S. & Antonny, B. (2016). Homeoviscous Adaptation and the Regulation of Membrane Lipids. J. Mol. Biol. 428:4776-4791.

Falagas, M. E., Tansarli, G. S., Karageorgopoulos, D. E. & Vardakas, K. Z. (2014). Deaths attributable to carbapenem-resistant Enterobacteriaceae infections. Emerging infectious diseases. 20(7):1170-1175.

Faltermann, S., Zucchi, S., Kohler, E., Blom, J. F., Pernthaler, J. & Fent, K. (2014). Molecular effects of the cyanobacterial toxin cyanopeptolin (CP1020) occurring in algal blooms: Global transcriptome analysis in zebrafish embryos. Aquatic toxicology. 149:33-39.

Fechtner, J., Koza, A., Sterpaio, P. D., Hapca, S. M. & Spiers, A. J. (2011). Surfactants expressed by soil pseudomonads alter local soil-water distribution, suggesting a hydrological role for these compounds. FEMS Microbiology Ecology. 78(1):50-58.

Feng, Y.-N., Zhang, Z.-C., Feng, J.-L. & Qiu, B.-S. (2012). Effects of UV-B Radiation and Periodic Desiccation on the Morphogenesis of the Edible Terrestrial Cyanobacterium Nostoc flagelliforme. Applied and environmental microbiology. 78(19):7075.

Ferrari, B. C., Binnerup, S. J. & Gillings, M. (2005). Microcolony Cultivation on a Soil Substrate Membrane System Selects for Previously Uncultured Soil Bacteria. Applied and environmental microbiology. 71(12):8714.

Ferrari, B. C., Bissett, A., Snape, I., van Dorst, J., Palmer, A. S., Ji, M., Siciliano, S. D., Stark, J. S., Winsley, T. & Brown, M. V. (2015). Geological connectivity drives microbial community structure and connectivity in polar, terrestrial ecosystems. Environmental Microbiology. 18(6):1834-1849.

Ferrari, B. C., Winsley, T., Gillings, M. & Binnerup, S. (2008). Cultivating previously uncultured soil bacteria using a soil substrate membrane system. Nature Protocols. 3(8):1261.

Fierer, N., Bradford, M. A. & Jackson, R. B. (2007). Toward an Ecological Classification of Soil Bacteria. Ecology. 88(6):1354-1364.

Finn, R. D., Coggill, P., Eberhardt, R. Y., Eddy, S. R., Mistry, J., Mitchell, A. L., Potter, S. C., Punta, M., Qureshi, M., Sangrador-Vegas, A., Salazar, G. A., Tate, J. & Bateman, A. (2016). The Pfam protein families database: towards a more sustainable future. Nucleic acids research. 44(D1):279-285.

Fischbach, M., Walsh, C. & Clardy, J. (2008). The evolution of gene collectives: How natural selection drives chemical innovation. Proceedings of the National Academy of Sciences of the United States of America. 105(12):4601-4608.

Fischbach, M. A. & Walsh, C. T. (2006). Assembly-line enzymology for polyketide and nonribosomal peptide antibiotics: Logic, machinery, and mechanisms. Chem. Rev. 106:3468-3496.

Flärdh, K. & Buttner, M. J. (2009). Streptomyces morphogenetics: dissecting differentiation in a filamentous bacterium. Nature Reviews Microbiology. 7(1):36.

Flatman, R. H., Howells, A., Heide, L., Fiedler, H. & Maxwell, A. (2005). Simocyclinone D8, an inhibitor of DNA gyrase with a novel mode of action. Antimicrob. Agents Chemother. 49(3):1093-1100.

Fleischmann, R., Adams, M., White, O. & Clayton, R. (1995). Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science. 269(5223):496.

Fryirs, K., Snape, I. & Babicka, N. (2013). The type and spatial distribution of past waste at the abandoned Wilkes Station, East Antarctica. The Polar Record. 49(4):328-347.

Funa, N., Ozawa, H., Hirata, A. & Horinouchi, S. (2006). Phenolic lipid synthesis by type III polyketide synthases is essential for cyst formation in Azotobacter vinelandii. Proceedings of the National Academy of Sciences of the United States of America. 103(16):6356-6361.

Galet, J., Deveau, A., Hôtel, L., Frey-Klett, P., Leblond, P., Aigle, B. & Löffler, F. E. (2015). Pseudomonas fluorescens Pirates both Ferrioxamine and Ferricoelichelin Siderophores from . Applied and environmental microbiology. 81(9):3132-3141.

Galperin, M. Y. (2013). Genome Diversity of Spore-Forming Firmicutes. Microbiology Spectrum. 1(2):doi:10.1128.

Gan, M., Zheng, X., Gan, L., Guan, Y., Hao, X., Liu, Y., Si, S., Zhang, Y., Yu, L. & Xiao, C. (2011). Streptothricin Derivatives from Streptomyces sp. I08A 1776. Journal of Natural Products. 74(5):1142-1147.

Ganzert, L., Lipski, A., Hubberten, H.-W. & Wagner, D. (2011). The impact of different soil parameters on the community structure of dominant bacteria from nine different soils located on Livingston Island, South Shetland Archipelago, Antarctica. FEMS Microbiology Ecology. 76(3):476-491.

Gaspari, F., Paitan, Y., Mainini, M., Losi, D., Ron, E. Z. & Marinelli, F. (2005). Myxobacteria isolated in Israel as potential source of new anti‐infectives. Journal of applied microbiology. 98(2):429-439.

Gemperlein, K., Rachid, S., Garcia, R. O., Wenzel, S. C. & Müller, R. (2014). Polyunsaturated fatty acid biosynthesis in myxobacteria: different PUFA synthases and their product diversity. Chemical Science. 5(5):1733-1741.

Genthon, C., Berne, A., Grazioli, J., Durán Alarcón, C., Praz, C. & Boudevillain, B. (2018). Precipitation at Dumont d'Urville, Adélie Land, East Antarctica: the APRES3 field campaigns dataset. Earth System Science Data. 10(3):1605-1612.

Gentile, G., Bonasera, V., Amico, C., Giuliano, L. & Yakimov, M. M. (2003). Shewanella sp. GA-22, a psychrophilic hydrocarbonoclastic antarctic bacterium producing polyunsaturated fatty acids. Journal of applied microbiology. 95(5):1124-1133.

Gesheva, V. (2010). Production of antibiotics and enzymes by soil microorganisms from the Windmill Islands region, Wilkes Land, East Antarctica. Polar Biology. 33(10):1351-1357.

Gilbert, D. N. (2010). The 10 × ‘20 Initiative: Pursuing a Global Commitment to Develop 10 New Antibacterial Drugs by 2020. Clinical infectious diseases. 50(8):1081-1083.

Goering, A. W., McClure, R. A., Doroghazi, J. R., Albright, J. C., Haverland, N. A., Zhang, Y., Ju, K.-S., Thomson, R. J., Metcalf, W. W. & Kelleher, N. L. (2016). Metabologenomics: Correlation of Microbial Gene Clusters with Metabolites Drives Discovery of a Nonribosomal Peptide with an Unusual Amino Acid Monomer. ACS Central Science. 2(2):99-108.

Goldstein, S., Beka, L., Graf, J. & Klassen, J. L. (2019). Evaluation of strategies for the assembly of diverse bacterial genomes using MinION long-read sequencing. BMC genomics. 20(1):23.

Gomez-Escribano, J., Alt, S. & Bibb, M. (2016). Next Generation Sequencing of Actinobacteria for the Discovery of Novel Natural Products. Marine drugs. 14(4):78.

Gontang, E. A., Gaudencio, S. P., Fenical, W. & Jensen, P. R. (2010). Sequence-Based Analysis of Secondary-Metabolite Biosynthesis in Marine Actinobacteria. Applied and environmental microbiology. 76(8):2487-2499.

Goodwin, I. D. (1993). Holocene Deglaciation, Sea-Level Change, and the Emergence of the Windmill Islands, Budd Coast, Antarctica. Quaternary research. 40(1):70-80.

Goodwin, S., McPherson, J. D. & McCombie, W. R. (2016). Coming of age: ten years of next-generation sequencing technologies. Nature Reviews. 17(6):333-351.

Goryachkin, S. V., Karavaeva, N. A., Targulian, V. O. & Glazov, M. V. (1999). Arctic soils: spatial distribution, zonality and transformation due to global change. Permafrost and Periglacial Processes. 10(3):235-250.

Greening, C., Constant, P., Hards, K., Morales, S. E., Oakeshott, J. G., Russell, R. J., Taylor, M. C., Berney, M., Conrad, R., Cook, G. M. & Müller, V. (2015). Atmospheric Hydrogen Scavenging: from Enzymes to Ecosystems. Applied and Environmental Microbiology. 81(4):1190-1199.

Guindon, S., Dufayard, J. F., Lefort, V., Anisimova, M., Hordijk, W. & Gascuel, O. (2010). New algorithms and methods to estimate maximum-likelihood phylogenies: Assessing the performance of PhyML 3.0. Systematic Biology. 59(3):307-321.

Guindon, S. & Gascuel, O. (2003). A Simple, Fast, and Accurate Algorithm to Estimate Large Phylogenies by Maximum Likelihood. Systematic Biology. 52(5):696-704.

Gupta, R. S. (2000). The phylogeny of proteobacteria: relationships to other eubacterial phyla and eukaryotes. FEMS Microbiology Review. 24:367-402.

Hanada, S. (2014). The Phylum Chloroflexi, the Family Chloroflexaceae, and the Related Phototrophic Families Oscillochloridaceae and Roseiflexaceae. In: Rosenberg, E., et al. (eds.) The Prokaryotes. Berlin, Heidelberg.

Hansen, B. B., Isaksen, K., Benestad, R. E., Kohler, J., Pedersen, Å. Ø., Loe, L. E., Coulson, S. J., Larsen, J. O. & Varpe, Ø. (2014). Warmer and wetter winters: characteristics and implications of an extreme weather event in the high arctic. Environmental Research Letters. 9(11):114021.

Haritash, A. K. & Kaushik, C. P. (2009). Biodegradation aspects of Polycyclic Aromatic Hydrocarbons (PAHs): A review. Journal of Hazardous Materials. 169(1-3):1-15.

Harvey, A. L., Edrada-Ebel, R. & Quinn, R. J. (2015). The re-emergence of natural products for drug discovery in the genomics era. Nature Reviews: Drug Discovery. 14(2):111-129.

Hashim, N., Bharudin, I., Nguong, D., Higa, S., Bakar, F., Nathan, S., Rabu, A., Kawahara, H., Illias, R., Najimudin, N., Mahadi, N. & Murad, A. (2013). Characterization of Afp1, an antifreeze protein from the psychrophilic yeast Glaciozyma antarctica PI12. Microbial Life Under Extreme Conditions. 17(1):63-73.

Hassanshahian, M., Emtiazi, G. & Cappello, S. (2012). Isolation and characterization of crude-oil-degrading bacteria from the Persian Gulf and the Caspian Sea. Marine pollution bulletin. 64(1):7-12.

Hayakawa, M. & Nonomura, H. (1987). Humic acid-vitamin agar, a new medium for the selective isolation of soil actinomycetes. Journal of Fermentation Technology. 65(5):501-509.

Helfrich, E. J. N. & Piel, J. (2016). Biosynthesis of polyketides by trans-AT polyketide synthases. Nat. Prod. Rep. 33(2):231-316.

Herrmann, J., Fayad, A. A. & Müller, R. (2017). Natural products from myxobacteria: novel metabolites and bioactivities. Natural Product Reports. 34(2):135-160.

Hertweck, C. (2009). The Biosynthetic Logic of Polyketide Diversity. Angewandte Chemie. 48:4688-4716.

Hicks, L. C., Ang, R., Leizeaga, A. & Rousk, J. (2019). Bacteria constrain the fungal growth response to drying-rewetting. Soil Biology and Biochemistry. 134:108-112.

Hoefler, B. C., Konganti, K. & Straight, P. D. (2013). De Novo Assembly of the Streptomyces sp. Strain Mg1 Genome Using PacBio Single-Molecule Sequencing. Genome Announcements. 1(4).

Hopwood, D. A. (2007). Streptomyces in Nature and Medicine: The Antibiotic Makers, Oxford University Press.

Hugenholtz, P., Skarshewski, A. & Parks, D. H. (2016). Genome-Based Microbial Taxonomy Coming of Age. Cold Spring Harbor Perspectives in Biology. 8(6):a018085.

Hunt, M., Silva, N. D., Otto, T. D., Parkhill, J., Keane, J. A. & Harris, S. R. (2015). Circlator: automated circularization of genome assemblies using long sequencing reads. Genome biology. 16(1):294.

Hyatt, D., Chen, G.-L., LoCascio, P. F., Land, M. L., Larimer, F. W. & Hauser, L. J. (2010). Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 11(1):119.

Imhoff, J. F., Rahn, T., Künzel, S. & Neulinger, S. C. (2018). Photosynthesis Is Widely Distributed among Proteobacteria as Demonstrated by the Phylogeny of PufLM Reaction Center Proteins. Frontiers in Microbiology. 8(2679).

Infectious Diseases Society of America (IDSA). (2004). Bad Bugs, No Drugs: As Antibiotic Discovery Stagnates A Public Health Crisis Brews [Online]. Available: https://www.idsociety.org/globalassets/idsa/policy--advocacy/current_topics_and_issues/antimicrobial_resistance/10x20/statements-manually-added/070104-as-antibiotic-discovery-stagnates-a-public-health-crisis-brews.pdf [Accessed June 2018].

Isaksen, K., Nordli, Ø., Førland, E. J., Łupikasza, E., Eastwood, S. & Niedźwiedź, T. (2016). Recent warming on Spitsbergen—Influence of atmospheric circulation and sea ice cover. Journal of Geophysical Research: Atmospheres. 121(20):11,913-11,931.

Janek, T., Łukaszewicz, M., Rezanka, T. & Krasowska, A. (2010). Isolation and characterization of two new lipopeptide biosurfactants produced by Pseudomonas fluorescens BD5 isolated from water from the Arctic Archipelago of Svalbard. Bioresource technology. 101(15):6118.

Janet, K. J. & Neslihan, T. (2014). The microbial ecology of permafrost. Nature Reviews Microbiology. 12(6):414.

Janssen, P. H., Yates, P. S., Grinton, B. E., Taylor, P. M. & Sait, M. (2002). Improved Culturability of Soil Bacteria and Isolation in Pure Culture of Novel Members of the Divisions Acidobacteria, Actinobacteria, Proteobacteria, and Verrucomicrobia. Applied and environmental microbiology. 68(5):2391.

Jansson, J. K. & Taş, N. (2014). The microbial ecology of permafrost. Nature Reviews Microbiology. 12(6):414.

Jenke-Kodama, H., Sandmann, A., Müller, R. & Dittmann, E. (2005). Evolutionary Implications of Bacterial Polyketide Synthases. Molecular Biology and Evolution. 22(10):2027-2039.

Jensen, P. R. & Mafnas, C. (2006). Biogeography of the marine actinomycete Salinispora. Environmental Microbiology. 8(11):1881-1888.

Ji, M., Greening, C., Vanwonterghem, I., Carere, C. R., Bay, S. K., Steen, J. A., Montgomery, K., Lines, T., Beardall, J., van Dorst, J., Snape, I., Stott, M. B., Hugenholtz, P. & Ferrari, B. C. (2017). Atmospheric trace gases support primary production in Antarctic desert surface soil. Nature. 552(7685):400-403.

Ji, M., van Dorst, J., Bissett, A., Brown, M. V., Palmer, A. S., Snape, I., Siciliano, S. D. & Ferrari, B. C. (2016). Microbial diversity at Mitchell Peninsula, Eastern Antarctica: a potential biodiversity “hotspot”. Polar Biology. 39(2):237-249.

Jorgensen, H., Fjaervik, E., Hakvag, S., Bruheim, P., Bredholt, H., Klinkenberg, G., Ellingsen, T. E. & Zotchev, S. B. (2009). Candicidin Biosynthesis Gene Cluster Is Widely Distributed among Streptomyces spp. Isolated from the Sediments and the Neuston Layer of the Trondheim Fjord, Norway. Applied and environmental microbiology. 75(10):3296-3303.

Joynt, R. & Seipke, R. F. (2018). A phylogenetic and evolutionary analysis of antimycin biosynthesis. Microbiology. 164(1):28-39.

Kaeberlein, T., Lewis, K. & Epstein, S. S. (2002). Isolating "uncultivable" microorganisms in pure culture in a simulated natural environment. Science. 296(5570):1127-1129.

Kamat, N. M. & Velho-Pereira, S. (2011). Antimicrobial screening of actinobacteria using a modified cross-streak method. Indian journal of pharmaceutical sciences. 73(2):223-228.

Kampfer, P., Rainey, F. A., Andersson, M. A., Nurmiaho Lassila, L. E., Ulrych, U., Busse, H., Weiss, N., Mikkola, R. & Salkinoja-Salonen, M. (2000). Frigoribacterium faeni gen. nov., sp. nov., a novel psychrophilic genus of the family Microbacteriaceae. International journal of systematic and evolutionary microbiology. 50(1):355-363.

Karandikar, A., Sharples, G. & Hobbs, G. (1996). Influence of medium composition on sporulation by Streptomyces coelicolor A3(2) grown on defined solid media. Biotechnology Techniques. 10(2):79-82.

Karlovsky, P. (2008). Secondary metabolites in soil ecology, Berlin, Springer.

Karwowski, J., Sunga, G., Kadam, S. & McAlpine, J. (1996). A method for the selective isolation of Myxococcus directly from soil. Journal of Industrial Microbiology. 16(4):230-236.

Katoh, K. & Standley, D. M. (2013). MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Molecular Biology and Evolution. 30(4):772.

Katz, L. & Baltz, R. (2016). Natural product discovery: past, present, and future. Official Journal of the Society for Industrial Microbiology and Biotechnology. 43(2):155-176.

Katz, M., Hover, B. & Brady, S. (2016). Culture-independent discovery of natural products from soil metagenomes. Official Journal of the Society for Industrial Microbiology and Biotechnology. 43(2):129-141.

Kawatani, M., Muroi, M., Wada, A., Inoue, G., Futamura, Y., Aono, H., Shimizu, K., Shimizu, T., Igarashi, Y., Takahashi-Ando, N. & Osada, H. (2016). Proteomic profiling reveals that collismycin A is an iron chelator. Scientific Reports. 6(38385):1-9.

Keatinge-Clay, A. T. (2012). The structures of type I polyketide synthases. Nat. Prod. Rep. 29(10):1050-1073.

Keatinge-Clay, A. T., Maltby, D. A., Medzihradszky, K. F., Khosla, C. & Stroud, R. M. (2004). An antibiotic factory caught in action. Nature Structural & Molecular Biology. 11(9):888.

Keller-Juslén, C., King, H. D., Kuhn, M. A. X., Loosli, H.-R., Pache, W., Petcher, T. J., Weber, H. P. & Wartburg, A. V. (1982). Tetronomycin, a novel polyether of unusual structure. Journal of Antibiotics. 35(2):142-150.

Kersters, K., De Vos, P., Gillis, M., Swings, J., Vandamme, P. & Stackebrandt, E. (2006). Introduction to the Proteobacteria. In: Dworkin, M., et al. (eds.) The Prokaryotes: Volume 5: Proteobacteria: Alpha and Beta Subclasses. New York, NY: Springer New York.

Kiernan, K. & McConnell, A. (2001). Impacts of geoscience research on the physical environment of the Vestfold Hills, Antarctica. Australian Journal of Earth Sciences. 48(5):767-768.

Kieser, T., Bibb, M. J., Buttner, M. J., Chater, K. F. & Hopwood, D. A. (2000). Practical Streptomyces Genetics, Norwich, England, John Innes Foundation.

Kim, J.-N., Kim, Y., Jeong, Y., Roe, J.-H., Kim, B.-G. & Cho, B.-K. (2015). Comparative Genomics Reveals the Core and Accessory Genomes of Streptomyces Species. Journal of microbiology and biotechnology. 25(10):1599-1605.

Kim, J. & Yi, G.-S. (2012). PKMiner: a database for exploring type II polyketide synthases. BMC Microbiology. 12(1):169.

Kingston, D. G. I. & Kolpak, M. X. (1980). Biosynthesis of antibiotics of the virginiamycin family. 1. Biosynthesis of virginiamycin M1: determination of the labeling pattern by the use of stable isotope techniques. Journal of the American Chemical Society. 102(18):5964-5966.

Koblížek, M. & Brussaard, C. (2015). Ecology of aerobic anoxygenic phototrophs in aquatic environments. FEMS microbiology reviews. 39(6):854-870.

Komaki, H., Fudou, R., Iizuka, T., Nakajima, D., Okazaki, K., Shibata, D., Ojika, M. & Harayama, S. (2008). PCR Detection of Type I Polyketide Synthase Genes in Myxobacteria. Applied and environmental microbiology. 74(17):5571-5574.

Komaki, H., Sakurai, K., Hosoyama, A., Kimura, A., Igarashi, Y. & Tamura, T. (2018). Diversity of nonribosomal peptide synthetase and polyketide synthase gene clusters among taxonomically close Streptomyces strains. Scientific Reports. 8(1):6888.

Kong, D. (2016). Frigoribacterium salinisoli sp. nov., isolated from saline soil, transfer of Frigoribacterium mesophilum to Parafrigoribacterium gen. nov. as Parafrigoribacterium mesophilum comb. nov. International journal of systematic and evolutionary microbiology. 66(12):5252-5259.

Koren, S., Harhay, G., Bono, J., Harhay, D., McVey, D., Radune, D., Bergman, N. & Phillippy, A. (2013). Reducing assembly complexity of microbial genomes with single-molecule sequencing. Genome biology. 14(9):R101.

Koren, S., Walenz, B. P., Berlin, K., Miller, J. R., Bergman, N. H. & Phillippy, A. M. (2017). Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Research. 27(5):722-736.

Kralova, S. (2017). Role of fatty acids in cold adaptation of Antarctic psychrophilic Flavobacterium spp. Syst. Appl. Microbiol. 40(6):329-333.

Krehenwinkel, H., Wolf, M., Lim, J. Y., Rominger, A. J., Simison, W. B. & Gillespie, R. G. (2017). Estimating and mitigating amplification bias in qualitative and quantitative arthropod metabarcoding. Scientific Reports. 7(1):17668.

Krick, A., Kehraus, S., Eberl, L., Riedel, K., Anke, H., Kaesler, I., Graeber, I., Szewzyk, U. & Konig, G. M. (2007). A Marine Mesorhizobium sp. Produces Structurally Novel Long-Chain N-Acyl-L-Homoserine Lactones. Applied and environmental microbiology. 73(11):3587-3594.

Kumarasamy, K. K., Toleman, M. A., Walsh, T. R., Bagaria, J., Butt, F., Balakrishnan, R., Chaudhary, U., Doumith, M., Giske, C. G., Irfan, S., Krishnan, P., Kumar, A. V., Maharjan, S., Mushtaq, S., Noorie, T., Paterson, D. L., Pearson, A., Perry, C., Pike, R. & Rao, B. (2010). Emergence of a new antibiotic resistance mechanism in India, Pakistan, and the UK: a molecular, biological, and epidemiological study. The Lancet Infectious Diseases. 10(9):597-602.

Kurtz, S., Phillippy, A., Delcher, A. L., Smoot, M., Shumway, M., Antonescu, C. & Salzberg, S. L. (2004). Versatile and open software for comparing large genomes. Genome biology. 5(2):R12-R12.

Labeda, D. P., Dunlap, C. A., Rong, X., Huang, Y., Doroghazi, J. R., Ju, K.-S. & Metcalf, W. W. (2017). Phylogenetic relationships in the family Streptomycetaceae using multi-locus sequence analysis. Antonie van Leeuwenhoek. 110(4):563-583.

Lagesen, K., Hallin, P., Rødland, E. A., Staerfeldt, H.-H., Rognes, T. & Ussery, D. W. (2007). RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic acids research. 35(9):3100.

Lane, D. J., Pace, B., Olsen, G. J., Stahl, D. A., Sogin, M. L. & Pace, N. R. (1985). Rapid determination of 16S ribosomal RNA sequences for phylogenetic analyses. Proceedings of the National Academy of Sciences of the United States of America. 82(20):6955-6959.

Larkin, M. A., Blackshields, G., Brown, N. P., Chenna, R., McGettigan, P. A., McWilliam, H., Valentin, F., Wallace, I. M., Wilm, A., Lopez, R., Thompson, J. D., Gibson, T. J. & Higgins, D. G. (2007). Clustal W and Clustal X version 2.0. Bioinformatics. 23(21):2947-2948.

Laureti, L., Song, L., Huang, S., Corre, C., Leblond, P., Challis, G. L. & Aigle, B. (2011). Identification of a bioactive 51-membered macrolide complex by activation of a silent polyketide synthase in Streptomyces ambofaciens. Proceedings of the National Academy of Sciences of the United States of America. 108(15):6258.

Lazzarini, A., Cavaletti, L., Toppo, G. & Marinelli, F. (2000). Rare genera of actinomycetes as potential producers of new antibiotics. International Journal of General and Molecular Microbiology. 78(3):399-405.

Leck, A. (1999). Preparation of lactophenol cotton blue slide mounts. Community Eye Health. 12(30):24.

Lee, L.-H., Cheah, Y.-K., Mohd Sidik, S., Ab Mutalib, N.-S., Tang, Y.-L., Lin, H.-P. & Hong, K. (2012). Molecular characterization of Antarctic actinobacteria and screening for antimicrobial metabolite production. World Journal of Microbiology and Biotechnology. 28(5):2125-2137.

Lemetre, C., Maniko, J., Charlop-Powers, Z., Sparrow, B., Lowe, A. J. & Brady, S. F. (2017). Bacterial natural product biosynthetic domain composition in soil correlates with changes in latitude on a continent-wide scale. Proceedings of the National Academy of Sciences of the United States of America. 114(44):11615.

Lennon, J. T. & Jones, S. E. (2011). Microbial seed banks: the ecological and evolutionary implications of dormancy. Nature Reviews. 9(2):119-130.

Lesins, G., Duck, T. J. & Drummond, J. R. (2010). Climate trends at Eureka in the Canadian high arctic. Atmosphere-ocean. 48(2):59-80.

Letunic, I. & Bork, P. (2016). Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees. Nucleic acids research. 44(1):W242-245.

Lever, M. A., Rogers, K. L., Lloyd, K. G., Overmann, J., Schink, B., Thauer, R. K., Hoehler, T. M., Jørgensen, B. B. & Giudici-Orticoni, M.-T. (2015). Life under extreme energy limitation: a synthesis of laboratory- and field-based investigations. FEMS microbiology reviews. 39(5):688-728.

Lévesque, E. 1997. Plant distribution and colonization in extreme polar deserts, Ellesmere Island, Canada. Doctor of Philosophy, University of Toronto.

Levy, S. E. & Myers, R. M. (2016). Advancements in Next-Generation Sequencing. Annu. Rev. Genom. Hum. Genet. 17(1):95-115.

Lewis, K. (2013). Platforms for antibiotic discovery. Nature Reviews Drug Discovery. 12(5):371.

Li, H. (2013). Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv:1303.3997v2.

Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G. & Durbin, R. (2009). The Sequence Alignment/Map format and SAMtools. Computer Applications in the Biosciences. 25(16):2078-2079.

Li, J., Nation, R. L., Turnidge, J. D., Milne, R. W., Coulthard, K., Rayner, C. R. & Paterson, D. L. (2006). Colistin: the re-emerging antibiotic for multidrug-resistant Gram-negative bacterial infections. The Lancet Infectious Diseases. 6(9):589.

Li, W., Ju, J., Rajski, S. R., Shen, B. & Osada, H. (2008). Characterization of the tautomycin biosynthetic gene cluster from Streptomyces spiroverticillatus unveiling new insights into dialkylmaleic anhydride and polyketide biosynthesis. Journal of Biological Chemistry. 283(42):28607-28617.

Liermann, L. J., Kalinowski, B. E., Brantley, S. L. & Ferry, J. G. (2000). Role of bacterial siderophores in dissolution of hornblende. Geochimica et cosmochimica acta. 64(4):587-602.

Ling, L. L., Schneider, T., Peoples, A. J., Spoering, A. L., Engels, I., Conlon, B. P., Mueller, A., Schäberle, T. F., Hughes, D. E., Epstein, S., Jones, M., Lazarides, L., Steadman, V. A., Cohen, D. R., Felix, C. R., Fetterman, K. A., Millett, W. P., Nitti, A. G., Zullo, A. M., Chen, C. & Lewis, K. (2015). A new antibiotic kills pathogens without detectable resistance. Nature. 517(7535).

Liu, G., Chater, K., Chandra, G., Niu, G. & Tan, H. (2013). Molecular Regulation of Antibiotic Biosynthesis in Streptomyces. Microbiology and Molecular Biology Reviews. 77:112.

Liu, W., Christenson, S. D., Standage, S. & Shen, B. (2002). Biosynthesis of the Enediyne Antitumor Antibiotic C-1027. Science. 297(5584):1170-1173.

Liu, Y.-Y., Wang, Y., Walsh, T. R., Yi, L.-X., Zhang, R., Spencer, J., Doi, Y., Tian, G., Dong, B., Huang, X., Yu, L.-F., Gu, D., Ren, H., Chen, X., Lv, L., He, D., Zhou, H., Liang, Z., Liu, J.-H. & Shen, J. (2016). Emergence of plasmid-mediated colistin resistance mechanism MCR-1 in animals and human beings in China: a microbiological and molecular biological study. The Lancet Infectious Diseases. 16(2):161-168.

Lloyd, K. G., Steen, A. D., Ladau, J., Yin, J., Crosby, L. & Neufeld, J. D. (2018). Phylogenetically Novel Uncultured Microbial Cells Dominate Earth Microbiomes. MSystems. 3(5):e00055-18.

Lobanovska, M. & Pilla, G. (2017). Penicillin’s Discovery and Antibiotic Resistance: Lessons for the Future? The Yale Journal of Biology and Medicine. 90(1):135-145.

Lundberg, K. S., Shoemaker, D. D., Adams, M. W., Short, J. M., Sorge, J. A. & Mathur, E. J. (1991). High-fidelity amplification using a thermostable DNA polymerase isolated from Pyrococcus furiosus. Gene. 108(1):1.

Lynch, M. (2006). Streamlining and Simplification of Microbial Genome Architecture. Annual Review of Microbiology. 60(1):327-349.

Maaloum, M., Diop, K., Diop, A., Anani, H., Tomei, E., Richez, M., Rathored, J., Bretelle, F., Raoult, D., Fenollar, F. & Fournier, P.-E. (2019). Description of Janibacter massiliensis sp. nov., cultured from the vaginal discharge of a patient with bacterial vaginosis. Antonie van Leeuwenhoek.:1572-9699.

MacNair, C. R., Stokes, J. M., Carfrae, L. A., Fiebig-Comyn, A. A., Coombes, B. K., Mulvey, M. R. & Brown, E. D. (2018). Overcoming mcr-1 mediated colistin resistance with colistin in combination with other antibiotics. Nature Communications. 9(458):1-8.

Maestre, F. T., Delgado-Baquerizo, M., Jeffries, T. C., Eldridge, D. J., Ochoa, V., Gozalo, B., Quero, J. L., García-Gómez, M., Gallardo, A., Ulrich, W., Bowker, M. A., Arredondo, T., Barraza-Zepeda, C., Bran, D., Florentino, A., Gaitán, J., Gutiérrez, J. R., Huber-Sannwald, E., Jankju, M. & Mau, R. L. (2015). Increasing aridity reduces soil microbial diversity and abundance in global drylands. Proceedings of the National Academy of Sciences of the United States of America. 112(51):15684-15689.

Makhalanyane, T. P., Valverde, A., Gunnigle, E., Frossard, A., Ramond, J.-B. & Cowan, D. A. (2015a). Microbial ecology of hot desert edaphic systems. FEMS microbiology reviews. 39(2):203-221.

Makhalanyane, T. P., Valverde, A., Velazquez, D., Gunnigle, E., Van Goethem, M., Quesada, A. & Cowan, D. (2015b). Ecology and biogeochemistry of cyanobacteria in soils, permafrost, aquatic and cryptic polar habitats. Biodivers. Conserv. 24:819-840.

Mansson, M., Gram, L. & Larsen, T. (2011). Production of Bioactive Secondary Metabolites by Marine Vibrionaceae. Mar. Drugs. 9:1440-1468.

Maplestone, R. A., Stone, M. J. & Williams, D. H. (1992). The evolutionary role of secondary metabolites-a review. Gene. 115(1-2):151.

Maresca, J. A., Graham, J. E. & Bryant, D. A. (2008). The biochemical basis for structural diversity in the carotenoids of chlorophototrophic bacteria. Photosynthesis research. 97(2):121-140.

Margesin, R. & Miteva, V. (2011). Diversity and ecology of psychrophilic microorganisms. Research in microbiology. 162(3):346-361.

Marizcurrena, J. J., Morales, D., Smircich, P., Castro-Sowinski, S. & Dennehy, J. J. (2019). Draft Genome Sequence of the UV-Resistant Antarctic Bacterium Sphingomonas sp. Strain UV9. Microbiology Resource Announcements. 8(7):e01651-18.

Masschelein, J., Clauwers, C., Awodi, U. R., Stalmans, K., Vermaelen, W., Lescrinier, E., Aertsen, A., Michiels, C., Challis, G. L. & Lavigne, R. (2015). A combination of polyunsaturated fatty acid, nonribosomal peptide and polyketide biosynthetic machinery is used to assemble the zeamine antibiotics. Chemical Science. 6(2):923-929.

Masschelein, J., Jenner, M. & Challis, G. L. (2017). Antibiotics from Gram-negative bacteria: a comprehensive overview and selected biosynthetic highlights. Natural Product Reports. 34(7):712-783.

Matsen, F. A., Kodner, R. B. & Armbrust, E. V. (2010). pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree. BMC Bioinformatics. 11:538.

McBride, M. J., Liu, W., Lu, X., Zhu, Y. & Zhang, W. (2014). The Family Cytophagaceae. In: Rosenberg, E., et al. (eds.) The Prokaryotes. Berlin, Heidelberg.

Medema, M. H., Blin, K., Cimermancic, P., de Jager, V., Zakrzewski, P., Fischbach, M. A., Weber, T., Takano, E. & Breitling, R. (2011). antiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences. Nucleic acids research. 39(2):W339-W346.

Medema, M. H., Cimermancic, P., Sali, A., Takano, E., Fischbach, M. A. & Ouzounis, C. A. (2014). A Systematic Computational Analysis of Biosynthetic Gene Cluster Evolution: Lessons for Engineering Biosynthesis. PLoS Computational Biology. 10(12):e1004016.

Medema, M. H., Kottmann, R., Yilmaz, P., Cummings, M., Biggins, J. B., Blin, K., de Bruijn, I., Chooi, Y. H., Claesen, J., Coates, R. C., Cruz-Morales, P., Duddela, S., Düsterhus, S., Edwards, D. J., Fewer, D. P., Garg, N., Geiger, C., Gomez-Escribano, J. P., Greule, A. & Hadjithomas, M. (2015). Minimum Information about a Biosynthetic Gene cluster. Nature chemical biology. 11(9):625-631.

Melick, D., Hovenden, M. & Seppelt, R. (1994). Phytogeography of bryophyte and lichen vegetation in the Windmill Islands, Wilkes Land, Continental Antarctica. Vegetatio. 111(1):71-87.

Migita, A., Watanabe, M., Hirose, Y., Watanabe, K., Tokiwano, T., Kinashi, H. & Oikawa, H. (2009). Identification of a Gene Cluster of Polyether Antibiotic Lasalocid from Streptomyces lasaliensis. Bioscience, biotechnology, and biochemistry. 73(1):169-176.

Miller, I., Chevrette, M. & Kwan, J. (2017). Interpreting Microbial Biosynthesis in the Genomic Age: Biological and Practical Considerations. Marine drugs. 15(6):165.

Milshteyn, A., Schneider, Jessica S. & Brady, Sean F. (2014). Mining the Metabiome: Identifying Novel Natural Products from Microbial Communities. Chemistry & biology. 21(9):1211-1223.

Moffitt, M. C. & Neilan, B. A. (2003). Evolutionary Affiliations Within the Superfamily of Ketosynthases Reflect Complex Pathway Associations. Journal of Molecular Evolution. 56(4):446-457.

Montero-Calasanz, M. d., Göker, M., Pötter, G., Rohde, M., Spröer, C., Schumann, P., Gorbushina, A. & Klenk, H.-P. (2013). Geodermatophilus africanus sp. nov., a halotolerant actinomycete isolated from Saharan desert sand. Journal of Microbiology. 104(2):207-216.

Morita, R. Y. (1975). Psychrophilic bacteria. Bacteriological reviews. 39(2):144.

Müller, C. A., Oberauner-Wappis, L., Peyman, A., Amos, G. C. A., Wellington, E. M. H. & Berg, G. (2015). Mining for Nonribosomal Peptide Synthetase and Polyketide Synthase Genes Revealed a High Level of Diversity in the Sphagnum Bog Metagenome. Applied and environmental microbiology. 81(15):5064.

Musilova, M., Wright, G., Ward, J. M. & Dartnell, L. R. (2015). Isolation of Radiation-Resistant Bacteria from Mars Analog Antarctic Dry Valleys by Preselection, and the Correlation between Radiation and Desiccation Resistance. Astrobiology. 15(12):1076-1090.

Nah, H.-J., Pyeon, H.-R., Kang, S.-H., Choi, S.-S. & Kim, E.-S. (2017). Cloning and Heterologous Expression of a Large-sized Natural Product Biosynthetic Gene Cluster in Streptomyces Species. Frontiers in Microbiology. 8(394).

Nakano, K., Shiroma, A., Shimoji, M., Tamotsu, H., Ashimine, N., Ohki, S., Shinzato, M., Minami, M., Nakanishi, T., Teruya, K., Satou, K. & Hirano, T. (2017). Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area. Human cell. 30(3):149-161.

Narvaez-Reinaldo, J. J., Barba, I., Gonzalez-Lopez, J., Tunnacliffe, A. & Manzanera, M. (2010). Rapid Method for Isolation of Desiccation-Tolerant Strains and Xeroprotectants. Applied and environmental microbiology. 76(15):5254-5262.

National Center for Biotechnology Information (NCBI). (2019). Genome Database [Online]. Available: https://www.ncbi.nlm.nih.gov/genome [Accessed Feb 2019].

Nett, M., Ikeda, H. & Moore, B. S. (2009). Genomic basis for natural product biosynthetic diversity in the actinomycetes. Natural Product Reports. 26(11):1362-1384.

Newman, D. J. & Cragg, G. M. (2012). Natural products as sources of new drugs over the 30 years from 1981 to 2010. Journal of Natural Products. 75(3):311-335.

Newman, D. J., Cragg, G. M. & Grothaus, P. (eds.) (2017). Chemical Biology of Natural Products, Florida: CRC Press.

Nicetic, I. 2016. Isolating Secondary Metabolite Producing Bacteria from Antarctic Desert Soils. Bachelor of Biotechnology (Honours), University of New South Wales.

Nichols, D., Bowman, J., Sanderson, K., Nichols, C. M., Lewis, T., McMeekin, T. & Nichols, P. D. (1999). Developments with Antarctic microorganisms: culture collections, bioactivity screening, taxonomy, PUFA production and cold-adapted enzymes. Curr. Opin. Biotechnol. 10(3):240-246.

Nichols, D., Cahoon, N., Trakhtenberg, E. M., Pham, L., Mehta, A., Belanger, A., Kanigan, T., Lewis, K. & Epstein, S. S. (2010a). Use of Ichip for High-Throughput In Situ Cultivation of "Uncultivable" Microbial Species. Applied and environmental microbiology. 76(8):2445-2450.

Nichols, D. S., Nichols, P. D. & McMeekin, T. A. (1993). Polyunsaturated fatty acids in Antarctic bacteria. Antarctic science. 5(2):149-160.

Nichols, P. D., Petrie, J. & Singh, S. (2010b). Long-chain omega-3 oils-an update on sustainable sources. Nutrients. 2(6):572.

Nicolaou, K. C. & Dai, W. M. (1991). Chemistry and Biology of the Enediyne Anticancer Antibiotics. Angewandte Chemie. 30(11):1387-1416.

Nikrad, M. P., Kerkhof, L. J., Häggblom, M. M. & Muyzer, G. (2016). The subzero microbiome: microbial activity in frozen and thawing soils. FEMS Microbiology Ecology. 92(6).

Niu, B., Fu, L., Zhu, Z., Wu, S. & Li, W. (2011). WebMGA: a customizable web server for fast metagenomic sequence analysis. BMC genomics. 12(1):444.

Normand, P. (2006). Geodermatophilaceae fam. nov., a formal description. International journal of systematic and evolutionary microbiology. 56(Pt 10):2277.

Norwegian Meteorological Institute and Norwegian Broadcasting Corporation (NMI). (2017). Weather statistics for Longyearbyen (Svalbard) [Online]. Available: https://www.yr.no/place/Norway/Svalbard/Longyearbyen/statistics.html [Accessed June 12 2017].

Norwegian Polar Institute (NPI). (2016). Map - TopoSvalbard [Online]. Available: https://toposvalbard.npolar.no/ [Accessed Oct 2017].

Nouioui, I., Carro, L., García-López, M., Meier-Kolthoff, J. P., Woyke, T., Kyrpides, N. C., Pukall, R., Klenk, H.-P., Goodfellow, M. & Göker, M. (2018). Genome-Based Taxonomic Classification of the Phylum Actinobacteria. Frontiers in Microbiology. 9:2007.

Núñez-Pons, L. & Avila, C. (2015). Natural products mediating ecological interactions in Antarctic benthic communities: a mini-review of the known molecules. Natural Product Reports. 32(7):1114-1130.

Nurk, S., Bankevich, A., Antipov, D., Gurevich, A., Korobeynikov, A., Lapidus, A., Prjibelsky, A., Pyshkin, A., Sirotkin, A., Sirotkin, Y., Stepanauskas, R., McLean, J., Lasken, R., Clingenpeel, S. R., Woyke, T., Tesler, G., Alekseyev, M. A. & Pevzner, P. A. (2013). Assembling genomes and mini-metagenomes from highly chimeric reads. Research in Computational Molecular Biology. 7821:158-170.

O'Malley, M. A. (2007). The nineteenth century roots of 'everything is everywhere'. Nature reviews. 5(8):647-651.

Obbels, D., Verleyen, E., Mano, M.-J., Namsaraev, Z., Sweetlove, M., Tytgat, B., Fernandez-Carazo, R., De Wever, A., D'Hondt, S., Ertz, D., Elster, J., Sabbe, K., Willems, A., Wilmotte, A., Vyverman, W. & Margesin, R. (2016). Bacterial and eukaryotic biodiversity patterns in terrestrial and aquatic habitats in the Sør Rondane Mountains, Dronning Maud Land, East Antarctica. FEMS Microbiology Ecology. 92(6):fiw041-13.

Ofria, C., Adami, C. & Collier, T. C. (2003). Selective pressures on genomes in molecular evolution. Journal of Theoretical Biology. 222(4):477-483.

Oksanen, J., Blanchet, F. G., Friendly, M., Kindt, R., Legendre, P., McGlinn, D., Minchin, P. R., O'Hara, R. B., Simpson, G. L., Solymos, P., Stevens, M. H. H., E., S. & Wagner, H. (2017). vegan: Community Ecology Package. R package version 2.4-3 [Online]. Available: https://CRAN.R-project.org/package=vegan [Accessed May 2017].

Okuyama, H., Orikasa, Y., Nishida, T., Watanabe, K. & Morita, N. (2007). Bacterial Genes Responsible for the Biosynthesis of Eicosapentaenoic and Docosahexaenoic Acids and Their Heterologous Expression. Applied and environmental microbiology. 73(3):665-670.

Olano, C., García, I., González, A., Rodriguez, M., Rozas, D., Rubio, J., Sánchez-Hidalgo, M., Braña, A. F., Méndez, C. & Salas, J. A. (2014). Activation and identification of five clusters for secondary metabolites in Streptomyces albus J1074. Microbial Biotechnology. 7(3):242-256.

Olano, C., Méndez, C. & Salas, J. (2009). Antitumor Compounds from Marine Actinomycetes. Marine drugs. 7(2):210-248.

Owen, J. G., Reddy, B. V. B., Ternei, M. A., Charlop-Powers, Z., Calle, P. Y., Kim, J. H. & Brady, S. F. (2013). Mapping gene clusters within arrayed metagenomic libraries to expand the structural diversity of biomedically relevant natural products. Proceedings of the National Academy of Sciences of the United States of America. 110(29):11797.

Pacific Biosciences. (2015). Procedure & Checklist - Preparing SMRTbell™ Libraries using PacBio® Barcoded Universal Primers for Multiplex SMRT® Sequencing [Online]. Available: http://www.pacb.com/wp-content/uploads/2015/09/Procedure-and-Checklist-Preparing-SMRTbell-Libraries-PacB-Barcoded-Universal-Primers.pdf [Accessed August 2015].

Pacific Biosciences. (2019). Genomic Consensus [Online]. Github. Available: https://github.com/PacificBiosciences/GenomicConsensus. [Accessed Sept 2018].

Palazzolo, A. M. E., Simons, C. L. W. & Burke, M. D. (2017). The natural productome. Proceedings of the National Academy of Sciences of the United States of America. 114(22):5564.

Panikov, N. S., Flanagan, P. W., Oechel, W. C., Mastepanov, M. A. & Christensen, T. R. (2006). Microbial activity in soils frozen to below -39 degrees C. Soil Biology & Biochemistry. 38(4):785-794.

Paraszkiewicz, K., Bernat, P., Siewiera, P., Moryl, M., Paszt, L. S., Trzciński, P., Jałowiecki, Ł. & Płaza, G. (2017). Agricultural potential of rhizospheric Bacillus subtilis strains exhibiting varied efficiency of surfactin production. Scientia Horticulturae. 225:802-809.

Parks, D. H., Imelfort, M., Skennerton, C. T., Hugenholtz, P. & Tyson, G. W. (2015). CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Research. 25(7):1043-1055.

Paul, E., Stüwe, K., Teasdale, J. & Worley, B. (1995). Structural and metamorphic geology of the Windmill Islands, east Antarctica: Field evidence for repeated tectonothermal activity. Australian Journal of Earth Sciences. 42(5):453-469.

Payne, A., Holmes, N., Rakyan, V. & Loose, M. (2018). Whale watching with BulkVis: A graphical viewer for Oxford Nanopore bulk fast5 files. BioRxiv.bty841.

Peeters, K., Ertz, D. & Willems, A. (2011). Culturable bacterial diversity at the Princess Elisabeth Station (Utsteinen, Sør Rondane Mountains, East Antarctica) harbours many new taxa. Systematic and applied microbiology. 34(5):360-367.

Peiru, S., Gramajo, H. & Menzella, H. (2009). Design and synthesis of pathway genes for polyketide biosynthesis. Methods Enzymol. 459:319-337.

Peng, C., Wang, H., Jiang, Y., Yang, J., Lai, H. & Wei, X. (2018). Exploring the Abundance and Diversity of Bacterial Communities and Quantifying Antibiotic-Related Genes Along an Elevational Gradient in Taibai Mountain, China. Microbial Ecology. 76(4):1053-1062.

Perfumo, A., Banat, I. & Marchant, R. (2018). Going Green and Cold: Biosurfactants from Low-Temperature Environments to Biotechnology Applications. Trends Biotechnol. 36:277-289.

Pfeifer, B. A. & Khosla, C. (2001). Biosynthesis of Polyketides in Heterologous Hosts. Microbiology and Molecular Biology Reviews. 65(1):106.

Pickens, L. B., Tang, Y. & Chooi, Y.-H. (2011). Metabolic Engineering for the Production of Natural Products. Annual Review of Chemical and Biomolecular Engineering. 2(1):211-236.

Piskozub, J. (2017). Svalbard as a study model of future High Arctic coastal environments in a warming world. Oceanologia. 59(4):612-619.

Plonka, P. M. (2006). Melanin synthesis in microorganisms - Biotechnological and medical aspects. Acta biochimica Polonica. 53(3):429-443.

Psifidi, A., Dovas, C. I., Bramis, G., Lazou, T., Russel, C. L., Arsenos, G. & Banos, G. (2015). Comparison of Eleven Methods for Genomic DNA Extraction Suitable for Large-Scale Whole-Genome Genotyping and Long-Term DNA Banking Using Blood Samples. PloS one. 10(1):e0115960.

Pudasaini, S., Wilson, J., Ji, M., van Dorst, J., Snape, I., Palmer, A. S., Burns, B. P. & Ferrari, B. C. (2017). Microbial Diversity of Browning Peninsula, Eastern Antarctica Revealed Using Molecular and Cultivation Methods. Frontiers in Microbiology. 8(591).

Pulsawat, N., Kitani, S., Fukushima, E. & Nihira, T. (2009). Hierarchical control of virginiamycin production in Streptomyces virginiae by three pathway-specific regulators: VmsS, VmsT and VmsR. Microbiology. 155(4):1250-1259.

Pulschen, A. A., Bendia, A. G., Fricker, A. D., Pellizari, V. H., Galante, D. & Rodrigues, F. (2017). Isolation of Uncultured Bacteria from Antarctica Using Long Incubation Periods and Low Nutritional Media. Frontiers in Microbiology. 8(1346).

Pye, C. R., Bertin, M. J., Lokey, R. S., Gerwick, W. H. & Linington, R. G. (2017). Retrospective analysis of natural products provides insights for future discovery trends. Proceedings of the National Academy of Sciences of the United States of America. 114(22):5601.

Qin, Q. L. (2014). Comparative genomics of the marine bacterial genus Glaciecola reveals the high degree of genomic diversity and genomic characteristic for cold adaptation. Environmental Microbiology. 16(6):1642-1653.

Raaijmakers, J. M., De Bruijn, I., Nybroe, O. & Ongena, M. (2010). Natural functions of lipopeptides from Bacillus and Pseudomonas: more than surfactants and antibiotics. FEMS microbiology reviews. 34(6):1037-1062.

Rainey, F. A., Ray, K., Ferreira, M., Gatz, B. Z., Nobre, M. F., Bagaley, D., Rash, B. A., Park, M. J., Earl, A. M., Shank, N. C., Small, A. M., Henk, M. C., Battista, J. R., Kampfer, P. & da Costa, M. S. (2005). Extensive Diversity of Ionizing-Radiation-Resistant Bacteria Recovered from Sonoran Desert Soil and Description of Nine New Species of the Genus Deinococcus Obtained from a Single Soil Sample. Applied and Environmental Microbiology. 71(9):5225-5235.

Ramaciotti Centre for Genomics. (2015). Sample Requirements for PacBio Sequencing [Online]. Available: https://www.ramaciotti.unsw.edu.au/wp-content/uploads/2015/02/RAMAC_PacBio_Sample_Submission_guidelines_Feb_2015.pdf [Accessed Aug 2017].

Rappé, M. S. & Giovannoni, S. J. (2003). The Uncultured Microbial Majority. Annu. Rev. Microbiol. 57(1):369-394.

Rausch, C., Hoof, I., Weber, T., Wohlleben, W. & Huson, D. H. (2007). Phylogenetic analysis of condensation domains in NRPS sheds light on their functional evolution. BMC Evolutionary Biology. 7(1):78.

Rayback, S. A. (2006). Reconstruction of summer temperature for a Canadian high arctic site from retrospective analysis of the dwarf shrub, Cassiope tetragona. Arctic, antarctic, and alpine research. 38(2):228-238.

Roongsawang, N., Lim, S. P., Washio, K., Takano, K., Kanaya, S. & Morikawa, M. (2005). Phylogenetic analysis of condensation domains in the nonribosomal peptide synthetases. FEMS Microbiology Letters. 252(1):143-151.

Roongsawang, N., Washio, K. & Morikawa, M. (2011). Diversity of Nonribosomal Peptide Synthetases Involved in the Biosynthesis of Lipopeptide Biosurfactants. International Journal of Molecular Sciences. 12(1):141-172.

Ruckert, G. (1985). Myxobacteria from Antarctic soils. Biology and Fertility of Soils. 1(4):215-216.

Rusch, D. B., Halpern, A. L., Sutton, G., Heidelberg, K. B., Williamson, S., Yooseph, S., Wu, D., Eisen, J. A., Hoffman, J. M., Remington, K., Beeson, K., Tran, B., Smith, H., Baden-Tillson, H., Stewart, C., Thorpe, J., Freeman, J., Andrews-Pfannkoch, C., Venter, J. E., Li, K., Kravitz, S., Heidelberg, J. F., Utterback, T., Rogers, Y.-H., Falcón, L. I., Souza, V., Bonilla-Rosso, G., Eguiarte, L. E., Karl, D. M., Sathyendranath, S., Platt, T., Bermingham, E., Gallardo, V., Tamayo-Castillo, G., Ferrari, M. R., Strausberg, R. L., Nealson, K., Friedman, R., Frazier, M. & Venter, J. C. (2007). The Sorcerer II Global Ocean Sampling Expedition: Northwest Atlantic through Eastern Tropical Pacific (Sorcerer II GOS Expedition). PLoS Biology. 5(3):e77.

Sansinenea, E. & Ortiz, A. (2011). Secondary metabolites of soil Bacillus spp. Biotechnology Letters. 33(8):1523-1538.

Sant, D. G., Tupe, S. G., Ramana, C. V. & Deshpande, M. V. (2016). Fungal cell membrane-promising drug target for antifungal therapy. Journal of applied microbiology. 121(6):1498-1510.

Sarwar, A., Brader, G., Corretto, E., Aleti, G., Abaidullah, M., Sessitsch, A., Hafeez, F. Y. & Lee, S.-W. (2018). Qualitative analysis of biosurfactants from Bacillus species exhibiting antifungal activity. PloS one. 13(6):e0198107.

Scambos, T. A., Campbell, G. G., Pope, A., Haran, T., Muto, A., Lazzara, M., Reijmer, C. H. & van den Broeke, M. R. (2018). Ultralow Surface Temperatures in East Antarctica From Satellite Thermal Infrared Mapping: The Coldest Places on Earth. Geophysical Research Letters. 45(12):6124-6133.

Schäberle, T. F., Lohr, F., Schmitz, A. & König, G. M. (2014). Antibiotics from myxobacteria. Natural Product Reports. 31(7):953-972.

Schmartz, P. C., Zerbe, K., Abou-Hadeed, K. & Robinson, J. A. (2014). Bis-chlorination of a hexapeptide-PCP conjugate by the halogenase involved in vancomycin biosynthesis. Org. Biomol. Chem. 12(30):5574-5577.

Schmid, M., Frei, D., Patrignani, A., Schlapbach, R., Frey, J. E., Remus-Emsermann, M. N. P. & Ahrens, C. H. (2018). Pushing the limits of de novo genome assembly for complex prokaryotic genomes harboring very long, near identical repeats. Nucleic acids research. 46(17):8953-8965.

Schulze, C. J., Donia, M. S., Siqueira-Neto, J. L., Ray, D., Raskatov, J. A., Green, R. E., McKerrow, J. H., Fischbach, M. A. & Linington, R. G. (2015). Genome-Directed Lead Discovery: Biosynthesis, Structure Elucidation, and Biological Evaluation of Two Families of Polyene Macrolactams against Trypanosoma brucei. ACS chemical biology. 10(10):2373-2381.

Sedlazeck, F. J., Lee, H., Darby, C. A. & Schatz, M. C. (2018). Piercing the dark matter: bioinformatics of long-range sequencing and mapping. Nature Reviews Genetics. 19(6):329-346.

Seemann, T. (2014). Prokka: rapid prokaryotic genome annotation. Bioinformatics. 30(14):2068-2069.

Seipke, R. F. & Loria, R. (2009). Hopanoids Are Not Essential for Growth of Streptomyces scabies 87-22. Journal of bacteriology. 191(16):5216-5223.

Seppelt, R. D., Broady, P. A., Pickard, J. & Adamson, D. A. (1988). Plants and landscape in the Vestfold Hills, Antarctica. Hydrobiologia. 165(1):185-196.

Sghaier, H., Hezbri, K., Ghodhbane-Gtari, F., Pujic, P., Sen, A., Daffonchio, D., Boudabous, A., Tisa, L. S., Klenk, H.-P., Armengaud, J., Normand, P. & Gtari, M. (2015). Stone-dwelling actinobacteria Blastococcus saxobsidens, Modestobacter marinus and Geodermatophilus obscurus proteogenomes. The ISME Journal. 10(1):21-29.

Shekh, R. M., Singh, P., Singh, S. M. & Roy, U. (2011). Antifungal activity of Arctic and Antarctic bacteria isolates. Polar Biology. 34(1):139-143.

Shen, B. (2015). A New Golden Age of Natural Products Drug Discovery. Cell. 163(6):1297-1300.

Shepherd, M. D., Kharel, M. K., Bosserman, M. A. & Rohr, J. (2010). Laboratory Maintenance of Streptomyces Species. Current Protocols in Microbiology. 18(1):1-10.

Sheraton, J. W. (1983). Archaean and Proterozoic geological relationships in the Vestfold Hills-Prydz Bay area, Antarctica. BMR Journal of Australian Geology & Geophysics. 8(2):119-128.

Shi, Y., Xiang, X., Shen, C., Chu, H., Neufeld, J. D., Walker, V. K., Grogan, P. & Schloss, P. D. (2015). Vegetation-Associated Impacts on Arctic Tundra Bacterial and Microeukaryotic Communities. Applied and Environmental Microbiology. 81(2):492-501.

Shimizu, Y., Ogata, H. & Goto, S. (2017). Type III Polyketide Synthases: Functional Classification and Phylogenomics. ChemBioChem. 18:50-65.

Shimkets, L. J., Dworkin, M. & Reichenbach, H. (2006). The Myxobacteria. In: Dworkin, M., et al. (eds.) The Prokaryotes: Volume 7: Proteobacteria: Delta, Epsilon Subclass. New York, NY: Springer New York.

Shimonishi, T., Nirasawa, S. & Hayashi, K. (1999). Cloning and expression of the N-acetylmuramidase gene from Streptomyces rutgersensis H-46. Journal of Bioscience and Bioengineering. 88(4):362-367.

Shirling, E. B. & Gottlieb, D. (1966). Methods for characterization of Streptomyces species. International journal of systematic and evolutionary microbiology. 16(3):313-340.

Shulse, C. N. & Allen, E. E. (2011). Widespread occurrence of secondary lipid biosynthesis potential in microbial lineages. PloS one. 6(5):e20146.

Siciliano, S. D., Palmer, A. S., Winsley, T., Lamb, E., Bissett, A., Brown, M. V., van Dorst, J., Ji, M., Ferrari, B. C., Grogan, P., Chu, H. & Snape, I. (2014). Soil fertility is associated with fungal and bacterial richness, whereas pH is associated with community composition in polar soil microbial communities. Soil Biology and Biochemistry. 78:10-20.

Sievers, F., Wilm, A., Dineen, D., Gibson, T., Karplus, K., Li, W., Lopez, R., McWilliam, H., Remmert, M., Söding, J., Thompson, J. & Higgins, D. (2011). Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Molecular Systems Biology. 7:539.

Silva, T. R., Duarte, A. W. F., Passarini, M. R. Z., Ruiz, A. L. T. G., Franco, C. H., Moraes, C. B., de Melo, I. S., Rodrigues, R. A., Fantinatti-Garboggini, F. & Oliveira, V. M. (2018). Bacteria from Antarctic environments: diversity and detection of antimicrobial, antiproliferative, and antiparasitic activities. Polar Biology. 41(7):1505-1519.

Smith, J., Tow, L., Stafford, W., Cary, C. & Cowan, D. (2006). Bacterial Diversity in Three Different Antarctic Cold Desert Mineral Soils. Microbial Ecology. 51(4):413-421.

Soina, V. S., Mulyukin, A. L., Demkina, E. V., Vorobyova, E. A. & El-Registan, G. I. (2004). The structure of resting bacterial populations in soil and subsoil permafrost. Astrobiology. 4(3):345.

Sprague, M., Dick, J. R. & Tocher, D. R. (2016). Impact of sustainable feeds on omega-3 long-chain fatty acid levels in farmed Atlantic salmon, 2006–2015. Scientific reports. 6(1):21892.

Stasik, S., Schuster, C., Ortlepp, C., Platzbecker, U., Bornhäuser, M., Schetelig, J., Ehninger, G., Folprecht, G. & Thiede, C. (2018). An optimized targeted Next-Generation Sequencing approach for sensitive detection of single nucleotide variants. Biomolecular Detection and Quantification. 15:6-12.

Staunton, J. & Wilkinson, B. (1997). Biosynthesis of Erythromycin and Rapamycin. Chemical reviews. 97(7):2611.

Stewart, K. J., Lamb, E. G., Coxson, D. S. & Siciliano, S. D. (2011). Bryophyte-cyanobacterial associations as a key factor in N2-fixation across the Canadian Arctic. Plant and soil. 344(1):335-346.

Stewart, K. J., Snape, I. & Siciliano, S. D. (2012). Physical, chemical and microbial soil properties of frost boils at Browning Peninsula, Antarctica. Polar Biology. 35(3):463-468.

Stomeo, F., Makhalanyane, T. P., Valverde, A., Pointing, S. B., Stevens, M. I., Cary, C. S., Tuffin, M. I. & Cowan, D. A. (2012). Abiotic factors influence microbial diversity in permanently cold soil horizons of a maritime-associated Antarctic Dry Valley. FEMS Microbiology Ecology. 82(2):326-340.

Straight, P. D., Willey, J. & Kolter, R. (2006). Interactions between Streptomyces coelicolor and Bacillus subtilis: Role of surfactants in raising aerial structures. J. Bacteriol. 188(13):4918-4925.

Studholme, D. J. (2016). Genome Update. Let the consumer beware: Streptomyces genome sequence quality. Microbial Biotechnology. 9(1):3-7.

Subramani, R. & Aalbersberg, W. (2013). Culturable rare Actinomycetes: diversity, isolation and marine natural product discovery. Applied Microbiology and Biotechnology. 97(21):9291-9321.

Sumi, C. D., Yang, B. W., Yeo, I.-C. & Hahm, Y. T. (2015). Antimicrobial peptides of the genus Bacillus: a new era for antibiotics. Canadian Journal of Microbiology. 61(2):93-103.

Sun, S., Li, S., Avera, B. N., Strahm, B. D., Badgley, B. D. & Löffler, F. E. (2017). Soil Bacterial and Fungal Communities Show Distinct Recovery Patterns during Forest Ecosystem Restoration. Applied and environmental microbiology. 83(14):e00966-17.

Tahon, G. & Willems, A. (2017). Isolation and characterization of aerobic anoxygenic phototrophs from exposed soils from the Sør Rondane Mountains, East Antarctica. Systematic and applied microbiology. 40(6):357-369.

Tanaka, T., Kawasaki, K., Daimon, S., Kitagawa, W., Yamamoto, K., Tamaki, H., Tanaka, M., Nakatsu, C. H., Kamagata, Y. & Nojiri, H. (2014). A Hidden Pitfall in the Preparation of Agar Media Undermines Microorganism Cultivability. Applied and environmental microbiology. 80(24):7659-7666.

Tange, O. (2018). GNU Parallel 2018, Ole Tange.

Tatusov, R. L. (2000). The COG database: A tool for genome-scale analysis of protein functions and evolution. Nucleic acids research. 28(1):33-36.

te Welscher, Y. M., van Leeuwen, M. R., de Kruijff, B., Dijksterhuis, J. & Breukink, E. (2012). Polyene antibiotic that inhibits membrane transport proteins. Proceedings of the National Academy of Sciences of the United States of America. 109(28):11156-11159.

Tedrow, J. C. F. (2004). Polar desert soils in perspective. Eurasian Soil Science. 37(5):443-450.

Terpe, K. (2013). Overview of thermostable DNA polymerases for classical PCR applications: from molecular and biochemical fundamentals to commercial systems. Applied Microbiology and Biotechnology. 97(24):10243-10254.

Tiwari, K. & Gupta, R. K. (2013). Diversity and isolation of rare actinomycetes: an overview. Critical Reviews in Microbiology. 39(3):256-294.

Trefzer, A., Pelzer, S., Schimana, J., Stockert, S., Bihlmaier, C., Fiedler, H., Welzel, K., Vente, A. & Bechthold, A. (2002). Biosynthetic gene cluster of simocyclinone, a natural multihybrid antibiotic. Antimicrob. Agents Chemother. 46(5):1174-1182.

Tsubokura, A., Yoneda, H. & Mizuta, H. (1999). Paracoccus carotinifaciens sp. nov., a new aerobic Gram-negative astaxanthin-producing bacterium. International Journal of Systematic Bacteriology. 49(1):277-282.

United States Geological Survey (USGS). (2008). Antarctica overview map [Online]. Available: https://lima.usgs.gov/documents/LIMA_overview_map.pdf [Accessed April 2019].

University of Texas Libraries (UTL). (2009). Arctic Map [Online]. Available: https://legacy.lib.utexas.edu/maps/islands_oceans_poles/?src=mappery [Accessed Aug 2018].

US Food and Drug Administration (FDA). (2014). Novel New Drugs 2014 Summary [Online]. Available: https://wayback.archive-it.org/7993/20170406032101/https://www.fda.gov/downloads/Drugs/DevelopmentApprovalProcess/DrugInnovation/UCM430299.pdf [Accessed June 2018].

US Food and Drug Administration (FDA). (2015). Novel Drugs 2015 Summary [Online]. Available: https://www.fda.gov/downloads/Drugs/DevelopmentApprovalProcess/DrugInnovation/UCM485053.pdf [Accessed June 2018].

van der Donk, W. A. & Nair, S. K. (2014). Structure and mechanism of lanthipeptide biosynthetic enzymes. Current Opinion in Structural Biology. 29:58-66.

van der Heul, H. U., Bilyk, B. L., McDowall, K. J., Seipke, R. F. & van Wezel, G. P. (2018). Regulation of antibiotic production in Actinobacteria: new perspectives from the post-genomic era. Natural Product Reports. 35(6):575-604.

van Dijk, E. L., Jaszczyszyn, Y., Naquin, D. & Thermes, C. (2018). The Third Revolution in Sequencing Technology. Trends in Genetics. 34(9):666-681.

van Dorst, J., Benaud, N. & Ferrari, B. (2017). New insights into the microbial diversity of polar desert soils: A biotechnological perspective. In: Chénard, C., et al. (eds.) Microbial Ecology of Extreme Environments. Springer.

van Dorst, J., Bissett, A., Palmer, A. S., Brown, M., Snape, I., Stark, J. S., Raymond, B., McKinlay, J., Ji, M., Winsley, T. & Ferrari, B. C. (2014). Community fingerprinting in a sequencing world. FEMS Microbiology Ecology. 89(2):316-330.

van Dorst, J. M., Hince, G., Snape, I. & Ferrari, B. C. (2016). Novel Culturing Techniques Select for Heterotrophs and Hydrocarbon Degraders in a Subantarctic Soil. Scientific Reports. 6:36724.

Velicer, G. J. & Yu, Y.-T. N. (2003). Evolution of novel cooperative swarming in the bacterium Myxococcus xanthus. Nature. 425(6953):75.

Ventola, C. L. (2015). The antibiotic resistance crisis: Part 1: causes and threats. Pharmacy and Therapeutics. 40(4):277-283.

Verleyen, E., Hodgson, D. A., Sabbe, K., Cremer, H., Emslie, S. D., Gibson, J., Hall, B., Imura, S., Kudoh, S., Marshall, G. J., McMinn, A., Melles, M., Newman, L., Roberts, D., Roberts, S. J., Singh, S. M., Sterken, M., Tavernier, I., Verkulich, S. & de Vyver, E. V. (2011). Post-glacial regional climate variability along the East Antarctic coastal margin—Evidence from shallow marine and coastal terrestrial records. Earth-science reviews. 104(4):199-212.

Vicente, C., Thibessard, A., Lorenzi, J.-N., Benhadj, M., Hôtel, L., Gacemi-Kirane, D., Lespinet, O., Leblond, P. & Aigle, B. (2018). Comparative Genomics among Closely Related Streptomyces Strains Revealed Specialized Metabolite Biosynthetic Gene Cluster Diversity. Antibiotics. 7(4):86.

Walter, M. H. & Strack, D. (2011). Carotenoids and their cleavage products: Biosynthesis and functions. Natural Product Reports. 28(4):663-692.

Wang, H., Fewer, D. P., Holm, L., Rouhiainen, L. & Sivonen, K. (2014). Atlas of nonribosomal peptide and polyketide biosynthetic pathways reveals common occurrence of nonmodular enzymes. Proceedings of the National Academy of Sciences of the United States of America. 111(25):9259-9264.

Wang, Y., Zhang, L., Zhang, X., Huang, J., Zhao, Y., Zhao, Y., Liu, J., Huang, C., Wang, J., Hu, Y., Ren, G. & Xu, X. (2017). Geodermatophilus daqingensis sp. nov., isolated from petroleum-contaminated soil. Journal of Microbiology. 110(6):803-809.

Warren, G. & David, J. S. (1993). Identification of protein coding regions by database similarity search. Nature Genetics. 3(3):266.

Watve, M. (2000). The 'K' selected oligophilic bacteria: A key to uncultured diversity? Current Science. 78(12):1535-1542.

Watve, M., Tickoo, R., Jog, M. & Bhole, B. (2001). How many antibiotics are produced by the genus Streptomyces? Archives of Microbiology. 176(5):386-390.

Weber, T., Blin, K., Duddela, S., Krug, D., Kim, H. U., Bruccoleri, R., Lee, S. Y., Fischbach, M. A., Müller, R., Wohlleben, W., Breitling, R., Takano, E. & Medema, M. H. (2015). antiSMASH 3.0—a comprehensive resource for the genome mining of biosynthetic gene clusters. Nucleic acids research. 43(W1):W237-W243.

Weissman, K. J. (2015a). The structural biology of biosynthetic megaenzymes. Nature chemical biology. 11(9):660.

Weissman, K. J. (2015b). Uncovering the structures of modular polyketide synthases. Nat. Prod. Rep. 32(3):436-453.

Wenzel, S. C. & Müller, R. (2007). Myxobacterial natural product assembly lines: fascinating examples of curious biochemistry. Natural Product Reports. 24(6):1211-1224.

Wenzel, S. C. & Müller, R. (2009). The impact of genomics on the exploitation of the myxobacterial secondary metabolome. Natural Product Reports. 26(11):1385-1407.

Wickham, H. (2009). ggplot2: Elegant Graphics for Data Analysis, New York, Springer-Verlag.

Wickham, H. (2011). ggplot2. Wiley Interdisciplinary Reviews: Computational Statistics. 3(2):180-185.

Wilkins, D., Yau, S., Williams, T. J., Allen, M. A., Brown, M. V., Demaere, M. Z., Lauro, F. M. & Cavicchioli, R. (2013). Key microbial drivers in Antarctic aquatic environments. FEMS Microbiology Reviews. 37:303-335.

Williams, L., Borchhardt, N., Colesie, C., Baum, C., Komsic-Buchmann, K., Rippin, M., Becker, B., Karsten, U. & Büdel, B. (2017). Biological soil crusts of Arctic Svalbard and of Livingston Island, Antarctica. Polar Biology. 40(2):399-411.

Wolin, E. A., Wolin, M. J. & Wolfe, R. S. (1963). Formation of Methane by Bacterial Extracts. Journal of Biological Chemistry. 238:2882-2886.

Wong, S. (2018). Life on the Edge. Bachelor of Biotechnology (Honours), University of New South Wales.

Woodhouse, J. N., Fan, L., Brown, M. V., Thomas, T. & Neilan, B. A. (2013). Deep sequencing of non-ribosomal peptide synthetases and polyketide synthases from the microbiomes of Australian marine sponges. The ISME Journal. 7(9):1842-1851.

Work, T. T., Jacobs, J. M., Spence, J. R. & Volney, W. J. (2010). High levels of green-tree retention are required to preserve ground beetle biodiversity in boreal mixedwood forests. Ecological applications. 20(3):741-751.

World Health Organisation (WHO). (2014). Antimicrobial Resistance: Global Report on Surveillance [Online]. Available: http://apps.who.int/iris/bitstream/10665/112642/1/9789241564748_eng.pdf [Accessed June 2018].

World Health Organisation (WHO). (2015a). Global action plan on antimicrobial resistance [Online]. Available: https://docs.google.com/viewer?url=http%3A%2F%2Fwww.who.int%2Firis%2Fbitstream%2F10665%2F193736%2F1%2F9789241509763_eng.pdf%3Fua%3D1 [Accessed May 2018].

World Health Organisation (WHO). (2015b). Report on global sexually transmitted infection surveillance [Online]. Available: http://apps.who.int/iris/bitstream/handle/10665/249553/9789241565301-eng.pdf [Accessed June 2018].

World Health Organisation (WHO). (2017). Global tuberculosis report [Online]. Available: http://apps.who.int/iris/bitstream/handle/10665/259366/9789241565516-eng.pdf [Accessed June 2018].

Wu, L.-F., He, H.-Y., Pan, H.-X., Han, L., Wang, R. & Tang, G.-L. (2014). Characterization of QmnD3/QmnD4 for Double Bond Formation in Quartromicin Biosynthesis. Organic letters. 16(6):1578-1581.

Wyres, K. L. & Holt, K. E. (2018). Klebsiella pneumoniae as a key trafficker of drug resistance genes from environmental to clinically important bacteria. Current Opinion in Microbiology. 45:131-139.

Xie, Y., Wright, S., Shen, Y. & Du, L. (2012). Bioactive natural products from Lysobacter. Nat. Prod. Rep. 29(11):1277-1287.

Yang, L., Liu, Y., Wu, H., Song, Z., Høiby, N., Molin, S. & Givskov, M. (2012). Combating biofilms. FEMS Immunology and Medical Microbiology. 65(2):146-157.

Yau, S., Lauro, F. M., Williams, T. J., Demaere, M. Z., Brown, M. V., Rich, J., Gibson, J. A. & Cavicchioli, R. (2013). Metagenomic insights into strategies of carbon conservation and unusual sulfur biogeochemistry in a hypersaline Antarctic lake. The ISME Journal. 7(10):1944.

Yergeau, E., Bokhorst, S., Kang, S., Zhou, J., Greer, C. W., Aerts, R. & Kowalchuk, G. A. (2012). Shifts in soil microorganisms in response to warming are consistent across a range of Antarctic environments. The ISME Journal. 6(3):692-702.

Yergeau, E., Newsham, K. K., Pearce, D. A. & Kowalchuk, G. A. (2007). Patterns of bacterial diversity across a range of Antarctic terrestrial habitats. Environmental microbiology. 9(11):2670-2682.

Yi Pan, S., Tan, G., Convey, P., Pearce, D. A. & Tan, I. K. (2013). Diversity and bioactivity of actinomycetes from Signy Island terrestrial soils, maritime Antarctic. Advances in Polar Science. 24(4):208-212.

Yu, Y., Tang, B., Dai, R., Zhang, B., Chen, L., Yang, H., Zhao, G. & Ding, X. (2018). Identification of the streptothricin and tunicamycin biosynthetic gene clusters by genome mining in Streptomyces sp. strain fd1-xmd. Applied Microbiology and Biotechnology. 102(6):2621-2633.

Zaburannyi, N., Rabyk, M., Ostash, B., Fedorenko, V. & Luzhetskyy, A. (2014). Insights into naturally minimised Streptomyces albus J1074 genome. BMC genomics. 15(1):97.

Zdanowski, M., Żmuda-Baranowska, M., Borsuk, P., Świątecki, A., Górniak, D., Wolicka, D., Jankowska, K. & Grzesiak, J. (2013). Culturable bacteria community development in postglacial soils of Ecology Glacier, King George Island, Antarctica. Polar Biology. 36(4):511-527.

Zengler, K. (2009). Central Role of the Cell in Microbial Ecology. Microbiology and Molecular Biology Reviews. 73(4):712-729.

Zhang, E. (2016). Dynamic Roles of a Secondary Metabolite Gene in the Ecology of Polar Soils. Bachelor of Advanced Science (Honours), University of New South Wales.

Zhang, E., Thibaut, L. M., Terauds, A., Wong, S., van Dorst, J., Tanaka, M. M. & Ferrari, B. (2019). Extreme niche partitioning promotes a remarkably high diversity of soil microbiomes across eastern Antarctica [Online]. Available: https://www.researchgate.net/publication/331323288_Extreme_niche_partitioning_promotes_a_remarkably_high_diversity_of_soil_microbiomes_across_eastern_Antarctica [Accessed March 2019].

Zhang, X., King‐Smith, E. & Renata, H. (2018). Total Synthesis of Tambromycin by Combining Chemocatalytic and Biocatalytic C−H Functionalization. Angewandte Chemie. 57(18):5037-5041.

Zhao, J., Yang, N., Chen, X., Jiang, Q. & Zeng, R. (2011). Phylogenetic diversity of Type I polyketide synthase genes from sediments of Ardley Island in Antarctica. Acta Oceanologica Sinica. 30(6):104-111.

Zhao, J., Yang, N. & Zeng, R. (2008). Phylogenetic analysis of type I polyketide synthase and nonribosomal peptide synthetase genes in Antarctic sediment. Extremophiles: Life Under Extreme Conditions. 12(1):97-105.

Zhu, H., Sandiford, S. & Wezel, G. (2014). Triggers and cues that activate antibiotic production by actinomycetes. Official Journal of the Society for Industrial Microbiology and Biotechnology. 41(2):371-386.

Zhu, R., Shi, Y., Ma, D., Wang, C., Xu, H. & Chu, H. (2015). Bacterial diversity is strongly associated with historical penguin activity in an Antarctic lake sediment profile. Scientific Reports. 5(1):17231.

Zhu, S., Wang, X., Zhang, D., Jing, X., Zhang, N., Yang, J. & Chen, J. (2016). Complete Genome Sequence of Hemolysin-Containing Carnobacterium sp. Strain CP1 Isolated from the Antarctic. Genome Announcements. 4(4):e00690-16.

Ziemert, N., Podell, S., Penn, K., Badger, J. H., Allen, E., Jensen, P. R. & de Crécy-Lagard, V. (2012). The Natural Product Domain Seeker NaPDoS: A Phylogeny Based Bioinformatic Tool to Classify Secondary Metabolite Gene Diversity. PloS one. 7(3):e34064.

Zotchev, S. B., Johnsen, G., Fjærvik, E. & Bredholt, H. (2008). Actinomycetes from Sediments in the Trondheim Fjord, Norway: Diversity and Biological Activity. Marine drugs. 6(1):12-24.

APPENDICES

APPENDIX 1 (CHAPTER 2)

Appendix Table A1.1 Taxonomic classification of PKS KS/AT domain sequences when analysed using both nucleotide BLASTn and translated protein sequence BLASTx algorithms.

PKS ASV# | %G+C | Accession (nucleotide) | Closest sequence match (nucleotide) | Sequ. sim. (%) | Accession (protein) | Closest sequence match (protein) | Sequ. sim. (%)
1 | 68 | CP003219.1 | Streptomyces cattleya_type I PKS | 68 | SDI33674.1 | Alloactinosynnema album_PKS | 62
2 | 66 | CP003219.1 | Streptomyces cattleya_type I PKS | 73 | SDI33633.1 | Alloactinosynnema album_PKS | 31
3 | 66 | JF970188.1 | Amycolatopsis orientalis_quartromicin_PKS | 68 | AFI57005.1 | Amycolatopsis orientalis_quartromycin PKS | 35
4 | 68 | CP014060.1 | Achromobacter xylosoxidans_hypothetical protein | 84 | SEO68750.1 | Amycolatopsis saalfeldensis_oxidoreductase | 65
5 | 73 | AJ278573.1 | Streptomyces natalensis_pimaricin_PKS | 70 | SEO77265.1 | Amycolatopsis saalfeldensis_polyene macrolide PKS | 50
6 | 67 | DQ897667.1 | Polyangium cellulosum_ambruticin AmbD_PKS | 69 | KJC40904.1 | Bradyrhizobium sp._PKS | 78
7 | 69 | CP002830.1 | Myxococcus fulvus_uncharacterised | 66 | CUS36179.1 | Candidatus Nitrospira nitrosa_PKS | 51
8 | 60 | CP012851.1 | Persicobacter sp._uncharacterised | 68 | SDM25909.1 | Catalinimonas alkaloidigena_PKS | 66
9 | 66 | CP001700.1 | Catenulispora acidiphila_PKS | 69 | WP_015795562.1 | Catenulispora acidiphila_PKS | 63
10 | 68 | CP006850.1 | Nocardia nova_nitroreductase | 79 | KRT62639.1 | Chloroflexi_malate synthase | 58
11 | 68 | CP006850.1 | Nocardia nova_nitroreductase | 79 | OGO54799.1 | Chloroflexi_malate synthase | 58
12 | 69 | CP003364.1 | Singulisphaera acidiphila_PKS | 65 | WP_016872856.1 | Chlorogloeopsis fritschii_PKS | 53

Appendix Table A1.1 Taxonomic classification of PKS KS/AT domain sequences when analysed using both nucleotide BLASTn and translated protein sequence BLASTx algorithms cont.

ASV# | %G+C | Accession (nucleotide) | Closest sequence match (nucleotide) | Sequ. sim. (%) | Accession (protein) | Closest sequence match (protein) | Sequ. sim. (%)
13 | 67 | AP010968.1 | Kitasatospora setae_PKS | 68 | WP_052387098.1 | Dactylosporangium aurantiacum_PKS | 53
14 | 62 | CP003382.1 | Deinococcus peraridilitoris_phosphodiesterase-like hydrolase | 64 | WP_029477642.1 | Deinococcus frigens_metallophosphatase/nucleotidase | 49
15 | 65 | CP002047.1 | Streptomyces bingchenggensis_phosphatase | 70 | WP_007513999.1 | Frankia sp._phosphatase | 35
16 | 67 | CP007128.1 | Gemmatirosa kalamazoonesis_hypothetical protein | 70 | ODT02251.1 | Gemmatimonadetes_uncharacterised | 62
17 | 68 | AY596297.1 | Haloarcula marismortui_dTDP-glucose-4,6-DH | 67 | WP_070364766.1 | Haloarchaeon_GDP-mannose 4,6 DH | 35
18 | 68 | CP011564.1 | Halanaeroarchaeum sulfurireducens_dTDP-glucose-4,6-DH | 70 | WP_011223374.1 | Haloarcula marismortui_GDP-mannose 4,6 DH | 53
19 | 67 | CP011564.1 | Halanaeroarchaeum sulfurireducens_dTDP-glucose-4,6-DH | 70 | WP_066414393.1 | Halorubrum sp._GDP-mannose 4,6 DH | 43
20 | 68 | CP011564.1 | Halanaeroarchaeum sulfurireducens_dTDP-glucose-4,6-DH | 71 | WP_049982662.1 | Halorubrum sp._GDP-mannose 4,6 DH | 62
21 | 67 | CP010849.1 | Streptomyces cyaneogriseus_hypothetical protein | 74 | WP_062432670.1 | Herbidospora daliensis_PKS | 51
22 | 75 | KP742963.1 | Streptomyces sp._heronamide PKS | 69 | WP_062342606.1 | Herbidospora sakaeratensis_PKS | 50

Appendix Table A1.1 Taxonomic classification of PKS KS/AT domain sequences when analysed using both nucleotide BLASTn and translated protein sequence BLASTx algorithms cont.

ASV# | %G+C | Accession (nucleotide) | Closest sequence match (nucleotide) | Sequ. sim. (%) | Accession (protein) | Closest sequence match (protein) | Sequ. sim. (%)
23 | 72 | CP003219.1 | Streptomyces cattleya_type I PKS | 67 | WP_062352166.1 | Herbidospora yilanensis_PKS | 62
24 | 73 | AJ278573.1 | Streptomyces natalensis_pimaricin PKS | 70 | SES41714.1 | Lentzea albida_PKS | 37
25 | 69 | CP012590.1 | Actinomyces sp_glycosyl transferase | 79 | WP_052464447.1 | Methyloceanibacter caenitepidi_DNA primase/polymerase | 42
26 | 74 | CP000850.1 | Salinispora arenicola_beta-ketoacyl synthase | 73 | SCG19052.1 | Micromonospora echinofusca_6-methylsalicylic acid synthase | 67
27 | 64 | CP002047.1 | Streptomyces bingchenggensis_phosphatase | 70 | WP_014740742.1 | Modestobacter marinus_phosphatase | 45
28 | 68 | CP002830.1 | Myxococcus fulvus_type I PKS | 66 | WP_070392794.1 | Moorea producens_PKS | 52
29 | 68 | CP002830.1 | Myxococcus fulvus_type I PKS | 66 | SDE49951.1 | Myxococcus virescens_PKS | 52
30 | 64 | CP003219.1 | Streptomyces cattleya_phosphatase | 70 | SDJ25484.1 | Nonomuraea maritima_endo/exonuclease/phosphatase | 41
31 | 61 | AB568601.1 | Streptomyces sp._reveromycin PKS | 73 | WP_012409595.1 | Nostoc punctiforme_PKS | 47
32 | 66 | CP002047.1 | Streptomyces bingchenggensis_phosphatase | 71 | SES38665.1 | Phycicoccus cremeus_phosphatase | 70
33 | 70 | CP011868.1 | Pseudonocardia sp._hypothetical protein | 76 | WP_060714094.1 | Pseudonocardia sp._transposase | 69
34 | 75 | KP742963.1 | Streptomyces sp._heronamide PKS | 71 | WP_037075676.1 | Pseudonocardia spinosispora_PKS | 69
35 | 74 | KP742963.1 | Streptomyces sp._heronamide PKS | 70 | WP_051341809.1 | Pseudonocardia spinosispora_PKS | 53

Appendix Table A1.1 Taxonomic classification of PKS KS/AT domain sequences when analysed using both nucleotide BLASTn and translated protein sequence BLASTx algorithms cont.

ASV# | %G+C | Accession (nucleotide) | Closest sequence match (nucleotide) | Sequ. sim. (%) | Accession (protein) | Closest sequence match (protein) | Sequ. sim. (%)
36 | 73 | AP010968.1 | Kitasatospora setae_modular PKS | 70 | WP_033442118.1 | Saccharothrix sp._PKS | 60
37 | 68 | CP003219.1 | Streptomyces cattleya_type I PKS | 70 | WP_051772302.1 | Saccharothrix sp._PKS | 39
38 | 67 | CP001700.1 | Catenulispora acidiphila_acyl transferase | 70 | WP_033438928.1 | Saccharothrix sp._PKS | 45
39 | 73 | CP000850.1 | Salinispora arenicola_beta-ketoacyl synthase | 73 | WP_029537616.1 | Salinispora arenicola_PKS | 67
40 | 72 | CP001804.1 | Haliangium ochraceum_6-deoxyerythronolide-B synthase | 71 | BAG69052.1 | Sorangium cellulosum_PKS | 57
41 | 67 | CP003969.1 | Sorangium cellulosum_hypothetical protein | 69 | WP_013376005.1 | Stigmatella aurantiaca_PKS | 52
42 | 73 | AJ871581.1 | Streptomyces achromogenes_rubradirin gene cluster | 81 | CAI94682.1 | Streptomyces achromogenes_PKS | 73
43 | 72 | GQ981380.1 | Sorangium cellulosum_thuggacin PKS | 71 | WP_040255771.1 | Streptomyces albus_PKS | 49
44 | 74 | AF324838.2 | Streptomyces antibioticus_simocyclinone PKS | 72 | AEU17899.1 | Streptomyces antibioticus_simocyclinone PKS | 64
45 | 67 | CP002047.1 | Streptomyces bingchenggensis_phosphatase | 69 | WP_033355444.1 | Streptomyces aureofaciens_phosphatase | 40
46 | 72 | CP006871.1 | Streptomyces albulus_PKS | 70 | WP_014174583.1 | Streptomyces bingchenggensis_PKS | 58
47 | 73 | CP003987.1 | Streptomyces sp._Erythronolide synthase | 69 | WP_053927100.1 | Streptomyces chattanoogensis_PKS | 38
48 | 64 | CP005080.1 | Streptomyces fulvissimus_epoxide hydrolase | 80 | WP_060893441.1 | Streptomyces europaeiscabiei_epoxide hydrolase | 82

Appendix Table A1.1 Taxonomic classification of PKS KS/AT domain sequences when analysed using both nucleotide BLASTn and translated protein sequence BLASTx algorithms cont.

ASV# | %G+C | Accession (nucleotide) | Closest sequence match (nucleotide) | Sequ. sim. (%) | Accession (protein) | Closest sequence match (protein) | Sequ. sim. (%)
49 | 75 | AP010968.1 | Kitasatospora setae_PKS | 71 | WP_051727297.1 | Streptomyces griseus_PKS | 98
50 | 65 | CP002047.1 | Streptomyces bingchenggensis_phosphatase | 71 | WP_009716296.1 | Streptomyces himastatinicus_phosphatase | 41
51 | 70 | LN997842.1 | Streptomyces reticuli_phenolphthiocerol PKS | 68 | AAQ20787.1 | Streptomyces hygroscopicus_PKS | 54
52 | 73 | CP006567.1 | Streptomyces rapamycinicus_hypothetical protein | 91 | AAQ20780.1 | Streptomyces hygroscopicus_PKS | 59
53 | 66 | KT209587.1 | Micromonospora sp._lobosamide PKS | 69 | WP_060954383.1 | Streptomyces hygroscopicus_PKS | 58
54 | 73 | AP010968.1 | Kitasatospora setae_modular PKS | 70 | CDR03059.1 | Streptomyces iranensis_PKS | 57
55 | 65 | CP003219.1 | Streptomyces cattleya_phosphatase | 70 | WP_046927220.1 | Streptomyces lydicus_phosphatase | 78
56 | 69 | CP010519.1 | Streptomyces albus_modular PKS | 71 | SED16442.1 | Streptomyces melanosporofaciens_PKS | 39
57 | 74 | AJ278573.1 | Streptomyces natalensis_pimaricin PKS | 71 | WP_067359169.1 | Streptomyces noursei_PKS | 62
58 | 69 | FJ545274.1 | Streptomyces antibioticus_indanomycin PKS | 71 | WP_069777516.1 | Streptomyces puniciscabiei_PKS | 59
59 | 69 | JX504844.1 | Streptomyces sp._hygrocin PKS | 69 | WP_069776473.1 | Streptomyces puniciscabiei_PKS | 42
60 | 72 | CP006567.1 | Streptomyces rapamycinicus_hypothetical protein | 94 | AGP59275.1 | Streptomyces rapamycinicus_PKS | 94
61 | 72 | CP006567.1 | Streptomyces rapamycinicus_hypothetical protein | 94 | AGP59291.1 | Streptomyces rapamycinicus_PKS | 78
62 | 70 | CP003987.1 | Streptomyces sp._modular PKS | 71 | KJS52374.1 | Streptomyces rubellomurinus_PKS | 65

Appendix Table A1.1 Taxonomic classification of PKS KS/AT domain sequences when analysed using both nucleotide BLASTn and translated protein sequence BLASTx algorithms cont.

ASV# | %G+C | Accession (nucleotide) | Closest sequence match (nucleotide) | Sequ. sim. (%) | Accession (protein) | Closest sequence match (protein) | Sequ. sim. (%)
63 | 74 | AJ278573.1 | Streptomyces natalensis_pimaricin PKS | 73 | WP_051746472.1 | Streptomyces scopuliridis_PKS | 72
64 | 73 | AB469193.1 | Streptomyces graminofaciens_FD-891 PKS | 69 | SCK25970.1 | Streptomyces sp._PKS | 59
65 | 72 | CP011799.1 | Streptomyces sp._hypothetical protein | 71 | KJY24546.1 | Streptomyces sp._PKS | 61
66 | 74 | AF324838.2 | Streptomyces antibioticus_simocyclinone PKS | 71 | KOV34270.1 | Streptomyces sp._PKS | 41
67 | 73 | AJ278573.1 | Streptomyces natalensis_pimaricin PKS | 71 | WP_064273739.1 | Streptomyces sp._PKS | 39
68 | 72 | LK022848.1 | Streptomyces iranensis_type I PKS | 72 | ACL97724.1 | Streptomyces sp._PKS | 61
69 | 66 | KT209587.1 | Micromonospora sp._lobosamide PKS | 69 | WP_064455734.1 | Streptomyces sp._PKS | 58
70 | 73 | HE648167.1 | Streptomyces hygroscopicus_antifungal L-155,175 PKS | 70 | WP_051906038.1 | Streptomyces sp._PKS | 61
71 | 74 | AF324838.2 | Streptomyces antibioticus_simocyclinone PKS | 71 | WP_039633289.1 | Streptomyces sp._PKS | 36
72 | 69 | CP010519.1 | Streptomyces albus_modular PKS | 70 | WP_007269166.1 | Streptomyces sp._PKS | 35
73 | 72 | HE648167.1 | Streptomyces hygroscopicus_antifungal L-155,175 PKS | 72 | WP_052744222.1 | Streptomyces sp._PKS | 62
74 | 68 | CP010519.1 | Streptomyces albus_modular PKS | 71 | WP_069860838.1 | Streptomyces sp._PKS | 36
75 | 69 | CP010519.1 | Streptomyces albus_modular PKS | 71 | WP_018840958.1 | Streptomyces sp._PKS | 55

Appendix Table A1.1 Taxonomic classification of PKS KS/AT domain sequences when analysed using both nucleotide BLASTn and translated protein sequence BLASTx algorithms cont.

ASV# | %G+C | Accession (nucleotide) | Closest sequence match (nucleotide) | Sequ. sim. (%) | Accession (protein) | Closest sequence match (protein) | Sequ. sim. (%)
76 | 75 | CP011799.1 | Streptomyces sp._hypothetical protein | 70 | WP_037770990.1 | Streptomyces sp._PKS | 57
77 | 71 | AB284188.1 | Streptomyces lasaliensis_modular PKS | 73 | WP_062204312.1 | Streptomyces sp._PKS | 79
78 | 71 | KT209587.1 | Micromonospora sp._lobosamide PKS | 70 | WP_046501326.1 | Streptomyces sp._PKS | 53
79 | 66 | CP002047.1 | Streptomyces bingchenggensis_type I PKS | 69 | WP_051807727.1 | Streptomyces sp._PKS | 64
80 | 73 | HE648167.1 | Streptomyces hygroscopicus_antifungal L-155,175 PKS | 71 | KJK33802.1 | Streptomyces variegatus_PKS | 61
81 | 70 | CP001700.1 | Catenulispora acidiphila_acyl transferase | 74 | AEM86155.1 | Streptomyces violaceusniger_PKS | 62
82 | 67 | AP010968.1 | Kitasatospora setae_modular PKS | 68 | WP_051787920.1 | Streptomyces wedmorensis_PKS | 56
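
The thesis abstract characterises these amplicons as having low sequence similarity (< 70%) to known genes. As an illustrative sketch only, the Python snippet below tallies which of the final seven PKS ASVs (76 to 82) fall below that threshold in each similarity column. The similarity values are copied from the rows above; the function name `below_threshold` and the tuple layout are hypothetical and not part of the original analysis.

```python
# (ASV#, nucleotide similarity %, protein similarity %) for PKS ASVs 76-82,
# copied from Appendix Table A1.1. The tuple layout is an assumption for illustration.
rows = [
    (76, 70, 57), (77, 73, 79), (78, 70, 53), (79, 69, 64),
    (80, 71, 61), (81, 74, 62), (82, 68, 56),
]

def below_threshold(rows, column, threshold=70):
    """Return ASV numbers whose similarity in `column` is strictly below `threshold`.

    column 1 = nucleotide (BLASTn) similarity, column 2 = protein (BLASTx) similarity.
    """
    return [row[0] for row in rows if row[column] < threshold]

low_nucleotide = below_threshold(rows, column=1)
low_protein = below_threshold(rows, column=2)

print("Below 70% (nucleotide):", low_nucleotide)
print("Below 70% (protein):", low_protein)
```

The same function could be applied to the full table once it is loaded from a delimited file; only the seven rows shown here are used so that the example stays self-contained.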


Appendix Table A1.2 Taxonomic classification of NRPS AD domain sequences when analysed using both nucleotide BLASTn and translated protein sequence BLASTx algorithms.

NRPS ASV# | %G+C | Accession (nucleotide) | Closest sequence match (nucleotide) | Sequ. sim. (%) | Accession (protein) | Closest sequence match (protein) | Sequ. sim. (%)
1 | 68 | LN831029.1 | Achromobacter xylosoxidans_NRPS | 74 | EFV85595.1 | Achromobacter xylosoxidans_NRPS | 66
2 | 69 | LN831029.1 | Achromobacter xylosoxidans_NRPS | 75 | EFV85595.1 | Achromobacter xylosoxidans_NRPS | 65
3 | 70 | KR062371.1 | Streptomyces sp._haoxinamide NRPS | 68 | KHD09577.1 | Actinokineospora inagensis_NRPS | 64
4 | 72 | CP001630.1 | Actinosynnema mirum_NRPS | 79 | WP_006929441.1 | Actinosynnema mirum_NRPS | 76
5 | 70 | LT629701.1 | Allokutzneria albata_NRPS | 74 | OEU80475.1 | Allokutzneria albata_NRPS | 53
6 | 70 | CP011799.1 | Streptomyces sp._siderophore NRPS | 73 | CDG85848.1 | Amycolatopsis halophila_NRPS | 66
7 | 68 | CP002600.1 | Burkholderia gladioli_NRPS | 70 | WP_045823724.1 | Amycolatopsis mediterranei_NRPS | 41
8 | 72 | CP020039.1 | Streptomyces sp._NRPS | 71 | SEU13784.1 | Amycolatopsis taiwanensis_NRPS | 75
9 | 63 | CP004370.1 | Streptomyces albus_PKS/NRPS | 71 | SDS85241.1 | Anabaena sp._NRPS | 60
10 | 73 | KF170355.1 | Streptomyces ansochromogenes_nikkomycin NRPS | 67 | ABA23702.1 | Anabaena variabilis_NRPS | 54
11 | 71 | CP013220.1 | Streptomyces hygroscopicus_acyl-CoA synthetase | 72 | KRT78094.1 | Armatimonadetes sp._NRPS | 64
12 | 68 | CP003347.1 | Mycobacterium sp._gramicidin NRPS | 96 | WP_052659492.1 | Bacillus alveayuensis_ATP-dependent acyl-CoA ligase | 34
13 | 69 | CP002994.1 | Streptomyces violaceusniger_NRPS | 70 | AEB21504.1 | Bacillus amyloliquefaciens_iturin NRPS | 49
14 | 74 | CP012109.1 | Myxococcus hansupus_siderophore NRPS | 76 | EEK70326.1 | Bacillus cereus_hypothetical NRPS | 58
15 | 72 | CP001804.1 | Haliangium ochraceum_NRPS | 77 | KOA72794.1 | Bacillus stratosphericus_hypothetical NRPS | 53

Appendix Table A1.2 Taxonomic classification of NRPS AD domain sequences when analysed using both nucleotide BLASTn and translated protein sequence BLASTx algorithms cont.

ASV# | %G+C | Accession (nucleotide) | Closest sequence match (nucleotide) | Sequ. sim. (%) | Accession (protein) | Closest sequence match (protein) | Sequ. sim. (%)
16 | 72 | AP010968.1 | Kitasatospora setae_hybrid NRPS/PKS | 69 | WP_008971883.1 | Bradyrhizobium sp._hybrid NRPS/PKS | 59
17 | 69 | KX707969.1 | Streptomyces sp._NRPS | 68 | WP_051188157.1 | Brevibacillus thermoruber_NRPS | 51
18 | 72 | CP011509.1 | Archangium gephyra_NRPS | 70 | WP_051188157.1 | Brevibacillus thermoruber_NRPS | 55
19 | 68 | CP006003.1 | Myxococcus fulvus_hypothetical protein | 69 | WP_071803044.1 | Brevibacillus thermoruber_NRPS | 54
20 | 70 | LT607753.1 | Micromonospora coxensis_NRPS | 66 | WP_051188157.1 | Brevibacillus thermoruber_NRPS | 50
21 | 68 | CP002399.1 | Micromonospora sp._NRPS | 73 | KIX33763.1 | Burkholderia pseudomallei_NRPS | 62
22 | 74 | CP002162.1 | Micromonospora aurantiaca_NRPS | 71 | KIX33763.1 | Burkholderia pseudomallei_NRPS | 63
23 | 71 | CP010415.1 | Azotobacter chroococcum_NRPS | 71 | KIX33763.1 | Burkholderia pseudomallei_NRPS | 60
24 | 70 | LT594324.1 | Micromonospora narathiwatensis_NRPS | 70 | WP_052485744.1 | Burkholderia sp._NRPS | 56
25 | 74 | CP009322.1 | Burkholderia gladioli_D-alanine-poly(phosphoribitol) ligase | 70 | WP_060122821.1 | Burkholderia vietnamiensis_hybrid NRPS/PKS | 52
26 | 75 | FP885907.1 | Ralstonia solanacearum_Glutamate racemase | 70 | WP_073619435.1 | Caldithrix abyssi_NRPS | 47
27 | 65 | CP015098.1 | Streptomyces sp._hypothetical protein | 71 | WP_052754440.1 | Calothrix sp._NRPS | 77
28 | 72 | CP013446.1 | Burkholderia ubonensis_NRPS | 75 | ETX02901.1 | Calothrix sp._NRPS | 64
29 | 74 | LT607411.1 | Micromonospora viridifaciens_NRPS | 68 | WP_030164825.1 | Candidatus Entotheonella sp._hypothetical NRPS | 58
30 | 66 | CP003969.1 | Sorangium cellulosum_hypothetical protein | 75 | OAD22016.1 | Candidatus Thiomargarita nelsonii_hypothetical NRPS | 61
31 | 74 | KP006601.1 | Eleftheria terrae_Teixobactin gene cluster | 69 | OAD22016.1 | Candidatus Thiomargarita nelsonii_hypothetical NRPS | 53

Appendix Table A1.2 Taxonomic classification of NRPS AD domain sequences when analysed using both nucleotide BLASTn and translated protein sequence BLASTx algorithms cont.

ASV# | %G+C | Accession (nucleotide) | Closest sequence match (nucleotide) | Sequ. sim. (%) | Accession (protein) | Closest sequence match (protein) | Sequ. sim. (%)
32 | 67 | LT559118.1 | Nonomuraea sp._NRPS/PKS | 74 | OAD22016.1 | Candidatus Thiomargarita nelsonii_NRPS | 57
33 | 71 | CP020567.1 | Streptomyces aureofaciens_NRPS | 70 | WP_050433124.1 | Candidatus Thiomargarita nelsonii_NRPS | 54
34 | 64 | CP006871.1 | Streptomyces albulus_thioester reductase | 70 | WP_050433124.1 | Chondromyces crocatus_hybrid NRPS/PKS | 59
35 | 72 | CP007130.1 | Gemmatirosa kalamazoonesis_NRPS | 70 | WP_050433036.1 | Chondromyces crocatus_hybrid NRPS/PKS | 61
36 | 69 | CP019779.1 | Streptomyces sp._hypothetical protein | 71 | WP_055409225.1 | Chondromyces crocatus_NRPS | 78
37 | 66 | CP003389.1 | Corallococcus coralloides_hybrid NRPS/PKS | 73 | WP_051188157.1 | Corallococcus coralloides_hybrid NRPS/PKS | 67
38 | 70 | AM420293.1 | Saccharopolyspora erythraea_NRPS | 80 | WP_015204113.1 | Couchioplanes caeruleus_NRPS | 79
39 | 66 | LN831790.1 | Streptomyces leeuwenhoekii_gramicidin NRPS | 73 | AHV79174.1 | Crinalium epipsammum_NRPS | 51
40 | 74 | CP016793.1 | Lentzea guizhouensis_hypothetical protein | 72 | AIW82284.1 | Cyanobacteria sp._hypothetical NRPS | 60
41 | 71 | KF170355.1 | Streptomyces ansochromogenes_nikkomycin NRPS | 67 | WP_071904628.1 | Cylindrospermum alatosporum_NRPS | 38
42 | 68 | CP002047.1 | Streptomyces bingchenggensis_NRPS | 71 | OEU84093.1 | Cystobacter ferrugineus_hypothetical hybrid NRPS/PKS | 62
43 | 66 | CP001700.1 | Catenulispora acidiphila_NRPS | 69 | OEU84093.1 | Desulfobacterales sp._hypothetical NRPS | 65
44 | 71 | HE971709.1 | Streptomyces davawensis_tyrocidine NRPS | 76 | OEU84093.1 | Desulfobacterales sp._hypothetical NRPS | 54

Appendix Table A1.2 Taxonomic classification of NRPS AD domain sequences when analysed using both nucleotide BLASTn and translated protein sequence BLASTx algorithms cont.

ASV# | %G+C | Accession (nucleotide) | Closest sequence match (nucleotide) | Sequ. sim. (%) | Accession (protein) | Closest sequence match (protein) | Sequ. sim. (%)
45 | 70 | KT362217.1 | Streptomyces calvus_WS9326 NRPS | 73 | OEU84093.1 | Desulfobacterales sp._hypothetical NRPS | 53
46 | 72 | KP756960.1 | Streptomyces canus_telomycin NRPS | 73 | OEU84093.1 | Desulfobacterales sp._hypothetical NRPS | 50
47 | 69 | AB698636.1 | Streptomyces turgidiscabies_hypothetical PKS/NRPS | 73 | OEU84093.1 | Desulfobacterales sp._hypothetical NRPS | 59
48 | 69 | HE575208.1 | Streptomyces sp._collismycin A NRPS | 85 | OEU84093.1 | Desulfobacterales sp._hypothetical NRPS | 53
49 | 72 | KF264564.1 | Catenulispora acidiphila_NRPS | 68 | WP_030430619.1 | Desulfobacterales sp._hypothetical NRPS | 55
50 | 71 | CP011340.1 | Streptomyces pristinaespiralis_pristinamycin NRPS | 75 | OEU84093.1 | Desulfobacterales sp._hypothetical NRPS | 72
51 | 66 | LT607750.1 | Micromonospora echinofusca_NRPS | 75 | SDJ90284.1 | Desulfobacterales sp._hypothetical NRPS | 49
52 | 68 | CP010849.1 | Streptomyces cyaneogriseus_hypothetical protein | 70 | WP_017308482.1 | Dyella jiangningensis_NRPS | 39
53 | 71 | KF264564.1 | Streptomyces purpeofuscus_NRPS | 68 | WP_026723921.1 | Fischerella sp._NRPS | 44
54 | 67 | KC876490.1 | Streptomyces sp._NRPS | 67 | WP_005549942.1 | Fischerella sp._NRPS | 55
55 | 71 | CP011667.1 | Streptomyces sp._hypothetical protein | 78 | WP_053458069.1 | Fischerella sp._NRPS | 52
56 | 67 | LT629775.1 | Streptomyces sp._NRPS | 78 | WP_062656581.1 | Frankia sp._NRPS | 84
57 | 69 | CP004025.1 | Myxococcus stipitatus_NRPS | 74 | ABX04518.1 | Hapalosiphon sp._NRPS | 71
58 | 65 | JN596952.1 | Lysobacter enzymogenes_WAPS NRPS | 71 | ABX04517.1 | Herpetosiphon aurantiacus_NRPS | 58
59 | 71 | CP011522.1 | Streptomyces sp._thioester reductase | 84 | WP_052750824.1 | Herpetosiphon aurantiacus_NRPS | 46

Appendix Table A1.2 Taxonomic classification of NRPS AD domain sequences when analysed using both nucleotide BLASTn and translated protein sequence BLASTx algorithms cont.

ASV# | %G+C | Accession (nucleotide) | Closest sequence match (nucleotide) | Sequ. sim. (%) | Accession (protein) | Closest sequence match (protein) | Sequ. sim. (%)
60 | 67 | CP006996.1 | Rhodococcus pyridinivorans_hypothetical protein | 73 | WP_051399675.1 | Hyphomicrobium sp._NRPS | 65
61 | 61 | CP016559.1 | Streptomyces clavuligerus_hypothetical protein | 71 | WP_054291157.1 | Janthinobacterium agaricidamnosum_NRPS | 64
62 | 73 | LN877229.1 | Kibdelosporangium sp._siderophore NRPS | 80 | WP_052478487.1 | Kibdelosporangium phytohabitans_NRPS | 82
63 | 75 | CP012752.1 | Kibdelosporangium phytohabitans_NRPS | 82 | WP_020387122.1 | Kibdelosporangium sp._NRPS | 84
64 | 73 | KX708190.1 | Streptomyces sp._NRPS | 78 | WP_020384980.1 | Kribbella catacumbae_hypothetical NRPS | 74
65 | 66 | AB432565.1 | Streptomyces abikoensis_NRPS | 74 | WP_020384980.1 | Kribbella catacumbae_NRPS | 89
66 | 70 | KF170330.1 | Streptomyces ansochromogenes_nikkomycin NRPS | 73 | AHH97300.1 | Kribbella catacumbae_NRPS | 88
67 | 67 | CP007155.1 | Kutzneria albida_NRPS | 71 | AKU98435.1 | Kutzneria albida_NRPS | 63
68 | 71 | CP012333.1 | Labilithrix luteola_NRPS/PKS | 82 | WP_068295934.1 | Labilithrix luteola_hybrid NRPS/PKS | 85
69 | 60 | CP011664.1 | Streptomyces sp._hypothetical protein | 67 | SDJ42953.1 | Labrys sp._hypothetical NRPS | 52
70 | 74 | LN850107.1 | Alloactinosynnema sp._siderophore NRPS | 75 | WP_013226636.1 | Lentzea violacea_NRPS | 78
71 | 73 | CP013141.1 | Lysobacter antibioticus_NRPS | 72 | WP_057917623.1 | Lysobacter antibioticus_NRPS | 57
72 | 69 | FP885896.1 | Ralstonia solanacearum_NRPS/PKS | 72 | WP_057917623.1 | Lysobacter antibioticus_NRPS | 58
73 | 71 | AL646053.1 | Ralstonia solanacearum_NRPS | 71 | WP_057917627.1 | Lysobacter antibioticus_NRPS | 66
74 | 68 | CP014517.1 | Variovorax sp._thioester reductase | 71 | WP_052103331.1 | Lysobacter concretionis_hybrid NRPS/PKS | 79

Appendix Table A1.2 Taxonomic classification of NRPS AD domain sequences when analysed using both nucleotide BLASTn and translated protein sequence BLASTx algorithms cont.

ASV# | %G+C | Accession (nucleotide) | Closest sequence match (nucleotide) | Sequ. sim. (%) | Accession (protein) | Closest sequence match (protein) | Sequ. sim. (%)
75 | 73 | CP008884.1 | Dyella japonica_thioester reductase | 70 | WP_052103331.1 | Lysobacter concretionis_hybrid NRPS/PKS | 67
76 | 70 | CP006259.1 | Streptomyces collinus_NRPS | 72 | WP_055900042.1 | Lysobacter sp._hybrid NRPS/PKS | 70
77 | 71 | CP011371.1 | Polyangium brachysporum_NRPS/PKS | 82 | WP_052197850.1 | Methylibium sp._NRPS/PKS | 85
78 | 64 | CP020567.1 | Streptomyces aureofaciens_NRPS | 74 | WP_024968867.1 | Microcystis aeruginosa_NRPS | 39
79 | 71 | CP007219.1 | Amycolatopsis lurida_NRPS | 71 | ELP56284.1 | Microcystis aeruginosa_NRPS | 64
80 | 69 | KX707867.1 | Streptomyces sp._NRPS | 73 | SCL14475.1 | Micromonospora chaiyaphumensis_NRPS | 51
81 | 69 | KX708344.1 | Streptomyces lunaelactis_NRPS | 75 | WP_027943904.1 | Micromonospora nigra_NRPS | 62
82 | 67 | AB701616.1 | Nocardia brasiliensis_NRPS | 75 | WP_051385598.1 | Multispecies_NRPS | 78
83 | 66 | CP012150.1 | Mycobacterium goodii_thioester reductase | 82 | WP_014397014.1 | Mycobacterium sp._NRPS | 86
84 | 72 | LT629775.1 | Streptomyces sp._NRPS | 77 | WP_011554413.1 | Myxococcus fulvus_NRPS | 37
85 | 70 | CP000113.1 | Myxococcus xanthus_NRPS | 72 | ALD82526.1 | Myxococcus xanthus_NRPS | 63
86 | 68 | KT067736.1 | Nannocystis sp._nannocystin NRPS | 83 | BAT54534.1 | Nannocystis sp._NRPS | 78
87 | 69 | KX707969.1 | Streptomyces sp._NRPS | 77 | AGQ47107.1 | Nostoc sp._NRPS | 49
88 | 73 | CP002830 | Myxococcus fulvus_NRPS | 77 | AGQ47107.1 | Nostoc sp._NRPS | 65
89 | 73 | CP020350.1 | Pectobacterium carotovorum_NRPS | 77 | AGQ47121.1 | Nostoc sp._NRPS | 72
90 | 67 | CP002271.1 | Stigmatella aurantiaca_bacitracin NRPS | 68 | WP_007954891.1 | Nostoc sp._NRPS | 59
91 | 68 | JX827856.1 | Myxococcus xanthus_myxochromide A NRPS | 70 | AAW55337.1 | Pelosinus fermentans_NRPS | 56
92 | 70 | CP012159.1 | Chondromyces crocatus_hypothetical protein | 74 | WP_019506242.1 | Pleurocapsa sp._NRPS | 66
93 | 69 | CP011341.1 | Rhodococcus aetherivorans_NRPS | 73 | OBQ03914.1 | Pleurocapsa sp._NRPS | 63

Appendix Table A1.2 Taxonomic classification of NRPS AD domain sequences when analysed using both nucleotide BLASTn and translated protein sequence BLASTx algorithms cont.

ASV# | %G+C | Accession (nucleotide) | Closest sequence match (nucleotide) | Sequ. sim. (%) | Accession (protein) | Closest sequence match (protein) | Sequ. sim. (%)
94 | 65 | LT827010.1 | Actinoplanes sp._NRPS | 72 | WP_063613608.1 | Pseudomonas chlororaphis_NRPS | 60
95 | 74 | CP013380.1 | Burkholderia sp._NRPS | 75 | WP_025130130.1 | Pseudomonas marginalis_NRPS | 75
96 | 66 | CP009434.1 | Burkholderia glumae_D-alanine-poly(phosphoribitol) ligase | 69 | ELP95414.1 | Pseudomonas sp._NRPS | 59
97 | 71 | CP011319.1 | Janthinobacterium sp._hypothetical protein | 72 | CUV18095.1 | Pseudomonas syringae_NRPS | 65
98 | 73 | CP013140.1 | Lysobacter enzymogenes_gramicidin NRPS | 75 | WP_068427237.1 | Ralstonia solanacearum_hypothetical NRPS | 67
99 | 70 | CP020567.1 | Streptomyces aureofaciens_NRPS | 71 | WP_037252570.1 | Rhodococcus kyotonensis_hypothetical NRPS | 61
100 | 70 | HE804045.1 | Saccharothrix espanaensis_NRPS | 77 | WP_072805399.1 | Rhodococcus rhodnii_NRPS | 79
101 | 66 | CP012150.1 | Mycobacterium goodii_thioester reductase | 79 | WP_054188088.1 | Rhodococcus sp. ADH_NRPS | 67
102 | 73 | CP015726.1 | Streptomyces sp._NRPS | 69 | EHK80199.1 | Rhodococcus yunnanensis_hypothetical NRPS | 64
103 | 66 | CP013358 | Burkholderia oklahomensis_NRPS | 67 | WP_073629376.1 | Saccharomonospora azurea_NRPS | 50
104 | 68 | CP003720.1 | Streptomyces hygroscopicus_NRPS | 77 | WP_015803773.1 | Pseudomonas syringae pv. Atrofaciens_NRPS | 68
105 | 73 | JX021290.1 | Streptomyces melanovinaceus_quinocarcin NRPS | 67 | WP_015250181.1 | Singulisphaera acidiphila_hybrid NRPS/PKS | 60
106 | 71 | LT607733.1 | Micromonospora echinofusca_NRPS | 67 | WP_020466555.1 | Singulisphaera acidiphila_NRPS | 63
107 | 72 | CP012382.1 | Streptomyces ambofaciens_NRPS | 77 | WP_061609229.1 | Sorangium cellulosum_NRPS | 58
108 | 66 | CP014168.1 | Sphingomonas panacis_hypothetical protein | 67 | WP_066825801.1 | Sphingomonas mali_hypothetical NRPS | 57
109 | 69 | CP011131.1 | Lysobacter gummosus_NRPS | 74 | WP_052069845.1 | Streptacidiphilus albus_NRPS | 37

Appendix Table A1.2 Taxonomic classification of NRPS AD domain sequences when analysed using both nucleotide BLASTn and translated protein sequence BLASTx algorithms cont.

ASV# | %G+C | Accession (nucleotide) | Closest sequence match (nucleotide) | Sequ. sim. (%) | Accession (protein) | Closest sequence match (protein) | Sequ. sim. (%)
110 | 71 | CP020039.1 | Streptomyces sp._NRPS | 70 | WP_052853332.1 | Streptomyces celluloflavus_NRPS | 63
111 | 71 | CP001814.1 | Streptosporangium roseum_NRPS | 77 | WP_053924717.1 | Streptomyces chattanoogensis_NRPS | 74
112 | 72 | CP003987.1 | Streptomyces sp._NRPS | 79 | WP_053924774.1 | Streptomyces chattanoogensis_NRPS | 77
113 | 72 | AF512431.1 | Saccharothrix mutabilis_NRPS | 76 | SED56229.1 | Streptomyces melanosporofaciens_NRPS | 65
114 | 65 | CP012687.1 | Ralstonia solanacearum_NRPS | 78 | SFL32914.1 | Streptomyces pini_NRPS | 47
115 | 77 | AB432412.1 | Streptomyces lilaceus_NRPS | 76 | SER97618.1 | Streptomyces qinglanensis_NRPS | 70
116 | 70 | AF329398.1 | Streptomyces roseochromogenes_clorobiocin NRPS | 83 | WP_031224174.1 | Streptomyces roseochromogenus_NRPS | 77
117 | 68 | CP0171316.1 | Streptomyces rubrolavendulae_tyrocidine NRPS | 81 | WP_048480441.1 | Streptomyces roseus_NRPS | 82
118 | 66 | AJ865878.1 | Catenulispora sp._NRPS | 72 | SCK04959.1 | Streptomyces sp._NRPS | 71
119 | 66 | CP004025.1 | Myxococcus stipitatus_NRPS | 68 | WP_047471668.1 | Streptomyces sp._hybrid NRPS/PKS | 100
120 | 69 | LK022848.1 | Streptomyces iranensis_NRPS | 71 | WP_073815332.1 | Streptomyces sp._hypothetical hybrid NRPS/PKS | 64
121 | 68 | AB672910.1 | Actinoplanes sp._NRPS | 75 | WP_069737028.1 | Streptomyces sp._hypothetical NRPS | 65
122 | 71 | CP019724.1 | Streptomyces pactum_hypothetical protein | 92 | KJY48006.1 | Streptomyces sp._hypothetical NRPS | 63
123 | 68 | KX708474.1 | Streptomyces sp._NRPS | 79 | EYU65217.1 | Streptomyces sp._hypothetical NRPS | 94
124 | 67 | LC177441.1 | Micromonospora sp._NRPS | 76 | APD71954.1 | Streptomyces sp._NRPS | 57

Appendix Table A1.2 Taxonomic classification of NRPS AD domain sequences when analysed using both nucleotide BLASTn and translated protein sequence BLASTx algorithms cont.

ASV# | %G+C | Accession (nucleotide) | Closest sequence match (nucleotide) | Sequ. sim. (%) | Accession (protein) | Closest sequence match (protein) | Sequ. sim. (%)
125 | 70 | CP003777.1 | Amycolatopsis mediterranei_NRPS | 74 | WP_052391239.1 | Streptomyces sp._NRPS | 67
126 | 69 | AM746676.1 | Sorangium cellulosum_NRPS | 71 | APD71723.1 | Streptomyces sp._NRPS | 55
127 | 68 | CP006272.1 | Actinoplanes friuliensis_NRPS | 76 | APD72147.1 | Streptomyces sp._NRPS | 77
128 | 72 | KC876465.1 | Streptomyces sp._NRPS | 76 | WP_073751644.1 | Streptomyces sp._NRPS | 80
129 | 68 | CP016001.1 | Burkholderia sp._hypothetical protein | 69 | SCF97699.1 | Streptomyces sp._NRPS | 67
130 | 67 | CP016001.1 | Burkholderia sp._hypothetical protein | 69 | SCG01944.1 | Streptomyces sp._NRPS | 68
131 | 73 | KX708474.1 | Streptomyces sp._NRPS | 76 | APD72146.1 | Streptomyces sp._NRPS | 75
132 | 64 | CP019798.1 | Streptomyces sp._hypothetical protein | 71 | WP_051806912.1 | Streptomyces sp._NRPS | 70
133 | 70 | AP017900.1 | Nocardia seriolae_NRPS | 73 | WP_031040675.1 | Streptomyces sp._NRPS | 46
134 | 73 | CP002593.1 | Pseudonocardia dioxanivorans_NRPS | 70 | WP_030750732.1 | Streptomyces sp._NRPS | 70
135 | 71 | CP011499.1 | Streptomyces incarnatus strain_hypothetical protein | 79 | WP_030160178.1 | Streptomyces sp._NRPS | 77
136 | 68 | CP002299.1 | Frankia sp._NRPS | 78 | WP_018381666.1 | Streptomyces vitaminophilus_NRPS | 74
137 | 71 | KX622587.1 | Cystobacterineae sp._myxochromide D NRPS | 68 | WP_052888712.1 | Thermogemmatispora carboxidivorans_hypothetical NRPS | 61
138 | 69 | CP001032.1 | Opitutus terrae_NRPS | 73 | WP_050045819.1 | Tolypothrix bouteillei_NRPS | 60
139 | 65 | CP020567.1 | Streptomyces aureofaciens_NRPS | 69 | WP_038074856.1 | Tolypothrix bouteillei_NRPS | 53
140 | 71 | LT607803.1 | Variovorax sp._NRPS | 72 | SCK42146.1 | Variovorax sp._NRPS | 66

Appendix Table A1.2 Taxonomic classification of NRPS AD domain sequences when analysed using both nucleotide BLASTn and translated protein sequence BLASTx algorithms cont.

ASV# | %G+C | Accession (nucleotide) | Closest sequence match (nucleotide) | Sequ. sim. (%) | Accession (protein) | Closest sequence match (protein) | Sequ. sim. (%)
141 | 68 | LT629735.1 | Opitutus sp._NRPS | 71 | SCF02615.1 | Williamsia sp._NRPS | 76
142 | 72 | CP017311.1 | Hydrogenophaga sp._4-hydroxyphenylpyruvate dioxygenase | 69 | ABD96132.1 | Wollea ambigua_NRPS | 56
143 | 68 | KF657738.1 | Sorangium cellulosum_microsclerodermin NRPS | 71 | ABD96132.1 | Wollea ambigua_NRPS | 56
144 | 66 | AB432571.1 | Streptomyces niger_NRPS | 78 | ABD96132.1 | Wollea ambigua_NRPS | 47

APPENDIX 2 (CHAPTER 3)

A2.1 STOCK SOLUTIONS

RAVAN STOCK SOLUTION WITH TRACE SALTS (1X)

Component | Per L
Glucose | 5 g
Peptone | 5 g
Yeast extract | 5 g
Sodium acetate | 5 g
Tri-sodium citrate | 5 g
Pyruvic acid | 2 g
Trace salts solution | 1 mL
Reference: Watve 2000; Shirling & Gottlieb 1966

Dissolved in sterile Milli-Q water, 0.22µm filter sterilised. Used at 0.05x concentration for culturing.
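
As a worked check of the dilution implied by the 0.05x working concentration, the short Python sketch below applies the standard relation C1·V1 = C2·V2. The function name `stock_volume_ml` is hypothetical and introduced only for illustration, and the 1 L final volume is assumed from the "Per L" basis of the recipes.

```python
def stock_volume_ml(stock_conc_x: float, target_conc_x: float, final_volume_ml: float) -> float:
    """Volume of stock (mL) needed to reach target_conc_x in final_volume_ml (C1*V1 = C2*V2)."""
    if stock_conc_x <= 0:
        raise ValueError("stock concentration must be positive")
    if target_conc_x > stock_conc_x:
        raise ValueError("target concentration cannot exceed stock concentration")
    return target_conc_x * final_volume_ml / stock_conc_x

# 1x RAVAN stock diluted to 0.05x in an assumed final volume of 1 L (1000 mL)
ravan_ml = stock_volume_ml(stock_conc_x=1.0, target_conc_x=0.05, final_volume_ml=1000.0)
print(f"{ravan_ml:.0f} mL of 1x stock per litre of medium")
```

The resulting 50 mL per litre agrees with the volume of RAVAN stock listed in the 0.05x media recipes in A2.2.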

SOIL EXTRACT STOCK SOLUTION

Component | Per L
Antarctic bulk soil (Casey station Test Plot 123567) | 500 g
Reference: Norman 1958

Soil combined with dH2O. Autoclaved 121°C for 45 min. Filtered through sterile cotton wool. Centrifuged 1753 x g, 20 min. Filtered through Whatman No. 1 filter paper, autoclaved 121°C for 15 min, sealed and stored 4°C.

TRACE SALTS SOLUTION

Component | Per L
FeSO4·7H2O | 1 g
MnCl2·4H2O | 1 g
ZnSO4·7H2O | 1 g
Reference: Shirling & Gottlieb 1966

Dissolved in 100 mL dH2O. Autoclaved 121°C for 15 min. Added 1 mL/L to RAVAN stock solution (1x).


WOLFE'S VITAMIN SOLUTION (10X)

Component | Per L
Biotin | 20 mg
Folic acid | 20 mg
Pyridoxine hydrochloride | 100 mg
Riboflavin | 50 mg
Thiamine | 50 mg
Nicotinic acid | 50 mg
Pantothenic acid | 50 mg
Vitamin B12 | 1 mg
p-aminobenzoic acid | 50 mg
Thioctic acid | 50 mg
Reference: Wolin et al. 1963; Bakermans et al. 2014

0.22 µm filter sterilised. Added to media at 50 µL/L after autoclaving and cooling.

A2.2 MEDIA

NUTRIENT AGAR (0.75x) (NA)

Component | Per L
Nutrient Broth (Oxoid) | 9.75 g
Agar | 15 g
Reference: Lapage et al. 1970

Dissolved in dH2O, pH 7, autoclaved 121°C for 15 min.

RAVAN MEDIA (0.05x) (RAVAN/TSV/GG)

Component | Per L
Gellan gum | 7.2 g
MgCl2·7H2O | 1 g
Wolfe's Vitamin solution (10x) | 50 μL
RAVAN stock solution with trace salts (1x) | 50 mL
Reference: Watve 2000; Tanaka et al. 2014; Wolin et al. 1963

Vigorously mixed gellan gum, dH2O and MgCl2, pH 7. Autoclaved 121°C for 15 min. Added sterile RAVAN stock and vitamin solution once cooled to ~50°C.

RAVAN/TSV BROTH (0.05x)

Component | Per L
RAVAN stock solution with trace salts (1x) | 50 mL
Wolfe's Vitamin solution (10x) | 50 μL
Reference: Watve 2000; Wolin et al. 1963

Combined sterile Milli-Q water, RAVAN stock and vitamin solutions.

SOIL EXTRACT GELLAN GUM (SEGG)

Component | Per L
Soil extract stock solution | 500 mL
Gellan gum | 7.2 g
CaCl2·2H2O | 1 g
Reference: Norman 1958; Suzuki 2001

Vigorously mixed gellan gum, dH2O and CaCl2. Added soil extract, pH 7, adjusted volume to 1 L, autoclaved 121°C for 15 min.

WATER AGAR WITH CYCLOHEXIMIDE (WCX)

Component | Per L
CaCl2·2H2O | 1 g
Agar | 15 g
Cycloheximide* (0.25 mg/mL) | 200 mL
Reference: Shimkets et al. 2006

Dissolved CaCl2 and agar in dH2O, pH 7, autoclaved 121°C for 15 min. Cooled to ~50°C, added sterile cycloheximide. *Toxic; prepared in fume hood.

APPENDIX 3 (CHAPTER 4)

A3.1 MEDIA

ISP4 (Inorganic Salts-Starch Agar)

Component | Per L
Soluble starch | 10 g
K2HPO4 | 1 g
MgSO4·7H2O | 1 g
NaCl | 1 g
(NH4)2SO4 | 2 g
CaCO3 | 2 g
Agar | 20 g
Trace salts solution | 1 mL
Reference: Shirling & Gottlieb 1966

All ingredients suspended in dH2O, pH 7, heated to boiling and mixed vigorously. Autoclaved 121°C for 15 min. Gently agitated constantly while pouring plates to maintain uniformity.


Appendix Table A3.1 Representative reference genomes chosen for mapping to Antarctic bacterial libraries, with corresponding quality measures determined by CheckM.

Antarctic isolate | Reference genome (name and GenBank assembly number) | Total length (Mb) | Cntgs | Cntg L50 | Comp. (%) | Contam. (%)
Streptomyces NBSH44 | Streptomyces bottropensis GCA_000383595.1 | 9 | 4 | 1 | 100 | 0.9
Mesorhizobium NBSH29 | Mesorhizobium loti GCA_000384055.1 | 7.5 | 5 | 2 | 99.9 | 1.2
Hymenobacter NBH84 | Hymenobacter swuensis GCA_000576555.1 | 5.3 | 4 | 1 | 99.9 | 0.3
Paracoccus NBH48 | Paracoccus zeaxanthinifaciens GCA_000420145.1 | 3 | 35 | 5 | 98.0 | 0.0
Leifsonia INR9 | Leifsonia xyli GCA_000470775.1 | 2.7 | 1 | 1 | 98.7 | 0.5
Streptomyces INR7 | Streptomyces lavendulae GCA_000715625.1 | 6.2 | 2,008 | 291 | 97.6 | 2.0
Kribbella SPB151 | Kribbella flavida GCF_000024345.1 | 7.6 | 1 | 1 | 100 | 2.9
Geodermatophilaceae NBWT11 | Modestobacter marinus GCA_000306785.1 | 5.6 | 1 | 1 | 100 | 1.7
Novosphingobium NBM11 | Novosphingobium stygium GCA_900102455.1 | 4.2 | 27 | 4 | 99.8 | 0.0
Sphingomonas NBWT7 | Sphingomonas mucosissima GCA_002197665.1 | 3.6 | 16 | 2 | 99.2 | 0.8
Quadrisphaera INWT6 | Quadrisphaera sp. GCA_900101335.1 | 3.2 | 4 | 2 | 99.4 | 1.1
Streptomyces NBH77 | Streptomyces hygroscopicus GCA_001553435.1 | 10.1 | 183 | 28 | 100 | 3.5
Azospirillum INR13 | Azospirillum lipoferum GCA_000283655.1 | 6.8 | 7 | 2 | 100 | 3.1
Pseudarthrobacter NBSH8 | Pseudarthrobacter sulfonivorans GCA_001484605.1 | 5.1 | 2 | 1 | 99.7 | 0.9
Frigoribacterium NBH87 | Frigoribacterium sp. GCA_000878135.1 | 3.4 | 136 | 16 | 99.0 | 0.4
Cryobacterium INWT7 | Cryobacterium psychrotolerans GCA_900101115.1 | 3.2 | 50 | 9 | 99.5 | 0.0

Appendix Table A3.2 Biosynthetic gene clusters detected in Antarctic bacterial genomes by antiSMASH.

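Isolate | Cntg | Rgn | BGC Type | From | To | Most sim cluster | Natural Product | % of genes sim | MIBiG BGC-ID
Streptomyces INR7 | 1 | 1 | NRPS fragment-Other | 286868 | 348993 | Elloramycin | polyketide | 12 | BGC0000219
Streptomyces INR7 | 1 | 2 | T3PKS | 370550 | 411608 | Alkylresorcinol | polyketide | 100 | BGC0000282
Streptomyces INR7 | 1 | 3 | Siderophore | 447338 | 459543 | Kedarcidin | polyketide | 1 | BGC0000081
Streptomyces INR7 | 1 | 4 | Melanin | 553133 | 580860 | Istamycin | saccharide | 4 | BGC0000700
Streptomyces INR7 | 1 | 5 | Terpene | 584230 | 604346 | Monensin | polyketide | 5 | BGC0000100
Streptomyces INR7 | 1 | 6 | Terpene | 661269 | 680721 | 2-methylisoborneol | terpene | 100 | BGC0000658
Streptomyces INR7 | 1 | 7 | NRPS | 728655 | 778958 | Coelichelin | NRPS | 100 | BGC0000325
Streptomyces INR7 | 1 | 8 | Terpene-Thiopeptide | 782973 | 833447 | - | - | - | -
Streptomyces INR7 | 1 | 9 | Otherks-T1PKS | 1099877 | 1150566 | - | - | - | -
Streptomyces INR7 | 1 | 10 | NRPS | 1162303 | 1228965 | Friulimicin | NRPS | 18 | BGC0000354
Streptomyces INR7 | 1 | 11 | Terpene | 1277535 | 1304219 | Hopene | terpene | 61 | BGC0000663
Streptomyces INR7 | 1 | 12 | T1PKS | 1538950 | 1585195 | Herboxidiene | polyketide | 11 | BGC0001065
Streptomyces INR7 | 1 | 13 | Butyrolactone-T1PKS | 1593272 | 1639156 | RK-682 | polyketide | 36 | BGC0000140
Streptomyces INR7 | 1 | 14 | Terpene | 1646997 | 1667094 | Geosmin | other | 100 | BGC0001181
Streptomyces INR7 | 1 | 15 | Bacteriocin | 1755478 | 1764508 | - | - | - | -
Streptomyces INR7 | 1 | 16 | NRPS-Other-T1PKS | 1873823 | 1977840 | Himastatin | NRPS | 48 | BGC0001117
Streptomyces INR7 | 1 | 17 | Siderophore-T2PKS | 2056759 | 2127469 | Kinamycin | polyketide | 40 | BGC0000236
Streptomyces INR7 | 1 | 18 | Arylpolyene | 2999561 | 3040700 | Svaricin | T1PKS-NRPS | 12 | BGC0001382
Streptomyces INR7 | 1 | 19 | Bacteriocin-NRPS | 3584707 | 3627541 | - | - | - | -
Streptomyces INR7 | 1 | 20 | Butyrolactone-NRPS fragment | 3944934 | 3985968 | Echosides | NRPS | 11 | BGC0000340
Streptomyces INR7 | 1 | 21 | Butyrolactone-Ladderane-Phosphonate-T1PKS | 4550092 | 4635880 | Jerangolid | polyketide | 19 | BGC0000080
Streptomyces INR7 | 1 | 22 | Siderophore | 5063109 | 5074890 | Desferrioxamine B | other | 83 | BGC0000940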
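
As a reading aid for the From and To columns, the Python sketch below derives approximate region lengths and flags regions with no reported MIBiG match for a few Streptomyces INR7 rows copied from the table above. The tuple layout and the function name `region_length_bp` are hypothetical, and the length is taken as a simple difference of the two coordinates; whether the coordinates are inclusive is not stated in the source, so the values are approximate.

```python
# (region, BGC type, From, To, % of genes similar or None when no match is reported)
# Selected Streptomyces INR7 rows from Appendix Table A3.2; the layout is illustrative.
regions = [
    (6, "Terpene", 661269, 680721, 100),
    (7, "NRPS", 728655, 778958, 100),
    (8, "Terpene-Thiopeptide", 782973, 833447, None),
    (16, "NRPS-Other-T1PKS", 1873823, 1977840, 48),
]

def region_length_bp(start: int, end: int) -> int:
    """Approximate region length in bp as To - From (no inclusive +1 correction assumed)."""
    if end < start:
        raise ValueError("To coordinate must not be smaller than From coordinate")
    return end - start

lengths = {rgn: region_length_bp(start, end) for rgn, _, start, end, _ in regions}
unmatched = [rgn for rgn, _, _, _, sim in regions if sim is None]

for rgn, bgc_type, start, end, sim in regions:
    label = "no MIBiG match" if sim is None else f"{sim}% of genes similar"
    print(f"Region {rgn} ({bgc_type}): {lengths[rgn]} bp, {label}")
```

Regions listed with "-" in the similarity columns, such as region 8 here, are those for which no similar known cluster is reported in the table.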


Appendix Table A3.2 Biosynthetic gene clusters detected in Antarctic bacterial genomes by antiSMASH cont.

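Isolate | Cntg | Rgn | BGC Type | From | To | Most sim cluster | Natural Product | % of genes sim | MIBiG BGC-ID
Streptomyces INR7 | 1 | 23 | Betalactone-NRPS-T1PKS | 6651979 | 6724583 | Chloramphenicol | other | 11 | BGC0000893
Streptomyces INR7 | 1 | 24 | T1PKS | 6758511 | 6803061 | A54145 | NRPS | 3 | BGC0000291
Streptomyces INR7 | 1 | 25 | NRPS | 6922334 | 6989329 | Tambromycin | NRPS | 100 | BGC0001368
Streptomyces INR7 | 1 | 26 | Terpene | 7117956 | 7136860 | Avermitilol | terpene | 100 | BGC0000683
Streptomyces INR7 | 1 | 27 | T2PKS | 7309518 | 7382060 | Spore pigment | polyketide | 66 | BGC0000271
Streptomyces INR7 | 1 | 28 | NRPS | 7610016 | 7697431 | Streptothricin | NRPS | 87 | BGC0000432
Streptomyces INR7 | 1 | 29 | Lanthipeptide | 7791151 | 7813450 | SapB | RiPP | 100 | BGC0000551
Streptomyces INR7 | 1 | 30 | NRPS-Terpene | 8039841 | 8120566 | Carotenoid | terpene | 63 | BGC0000633
Streptomyces INR7 | 1 | 31 | Lanthipeptide | 8192303 | 8214990 | Venezuelin | RiPP | 100 | BGC0000563
Streptomyces NBSH44 | 1 | 1 | Terpene | 207550 | 231831 | Isorenieratene | terpene | 100 | BGC0000664
Streptomyces NBSH44 | 1 | 2 | Terpene | 528348 | 554307 | Hopene | terpene | 84 | BGC0000663
Streptomyces NBSH44 | 1 | 3 | Bacteriocin | 1153838 | 1165166 | - | - | - | -
Streptomyces NBSH44 | 1 | 4 | NRPS-T1PKS | 1194207 | 1238983 | Lysolipin | polyketide | 4 | BGC0000242
Streptomyces NBSH44 | 1 | 5 | Melanin | 1294356 | 1304841 | Melanin | other | 100 | BGC0000911
Streptomyces NBSH44 | 1 | 6 | Siderophore | 1564457 | 1577722 | - | - | - | -
Streptomyces NBSH44 | 1 | 7 | Terpene | 2101686 | 2121630 | - | - | - | -
Streptomyces NBSH44 | 1 | 8 | Blactam | 2236666 | 2256050 | Carbapenem MM 4550 | other | 51 | BGC0000842
Streptomyces NBSH44 | 1 | 9 | NRPS-T1PKS | 2605002 | 2657055 | Enduracidin | NRPS | 8 | BGC0000341
Streptomyces NBSH44 | 1 | 10 | Terpene | 3301371 | 3321625 | - | - | - | -
Streptomyces NBSH44 | 1 | 11 | Betalactone-NRPS | 3348642 | 3401369 | Bacillibactin | NRPS | 15 | BGC0000309
Streptomyces NBSH44 | 1 | 12 | Butyrolactone | 3998044 | 4008937 | Lactonamycin | polyketide | 3 | BGC0000238
Streptomyces NBSH44 | 1 | 13 | Blactam | 4532324 | 4555881 | Clavulanic acid | other | 20 | BGC0000845
Streptomyces NBSH44 | 1 | 14 | NRPS fragment | 4623851 | 4664473 | - | - | - | -
Streptomyces NBSH44 | 1 | 15 | Siderophore | 5061124 | 5072899 | Desferrioxamine B | other | 80 | BGC0000941
Streptomyces NBSH44 | 1 | 16 | NRPS-T1PKS | 5249900 | 5297396 | Goadsporin | RiPP | 12 | BGC0000565
Streptomyces NBSH44 | 1 | 17 | NRPS | 5552402 | 5605402 | Mannopeptimycin | NRPS | 7 | BGC0000388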

Appendix Table A3.2 Biosynthetic gene clusters detected in Antarctic bacterial genomes by antiSMASH cont.

| Isolate | Contig | Region | BGC type | From | To | Most similar cluster | Natural product | % of genes similar | MIBiG BGC-ID |
|---|---|---|---|---|---|---|---|---|---|
| Streptomyces NBSH44 | 1 | 18 | NRPS-Otherks | 5633232 | 5710664 | Clorobiocin | other | 10 | BGC0000832 |
| | | 19 | Ectoine | 6322570 | 6332968 | Ectoine | other | 100 | BGC0000853 |
| | | 20 | Terpene | 6771801 | 6791559 | Steffimycin | polyketide | 16 | BGC0000273 |
| | | 21 | Lanthipeptide | 7111642 | 7133324 | - | - | - | - |
| | | 22 | Lanthipeptide | 7218059 | 7240725 | AmfS | RiPP | 100 | BGC0000496 |
| | | 23 | Terpene | 7297327 | 7319549 | Geosmin | other | 100 | BGC0001181 |
| | 2 | 1 | NRPS | 3 | 36985 | Thiolutin | NRPS | 36 | BGC0001193 |
| | | 2 | Butyrolactone-T1PKS | 57233 | 111937 | C-1027 | hybrid | 42 | BGC0000965 |
| | | 3 | NRPS | 132497 | 177003 | Maduropeptin | hybrid | 16 | BGC0001008 |
| Streptomyces NBH77 | 1 | 1 | Terpene | 93417 | 113757 | - | - | - | - |
| | | 2 | NRPS | 200003 | 258132 | Pellasoren | hybrid | 25 | BGC0001034 |
| | | 3 | NRPS-NRPS fragment-T1PKS | 271098 | 480551 | Candicidin | polyketide | 100 | BGC0000034 |
| | | 4 | T3PKS | 481279 | 522376 | Herboxidiene | polyketide | 12 | BGC0001065 |
| | | 5 | NRPS-T1PKS-Terpene | 578919 | 639733 | Isorenieratene | terpene | 85 | BGC0000664 |
| | | 6 | Ectoine | 1316160 | 1326558 | Ectoine | other | 100 | BGC0000853 |
| | | 7 | Siderophore | 2186550 | 2197079 | Desferrioxamine B | other | 100 | BGC0000941 |
| | | 8 | NRPS | 2847125 | 2952068 | Mannopeptimycin | NRPS | 7 | BGC0000388 |
| | | 9 | NRPS | 3171323 | 3232111 | Mannopeptimycin | NRPS | 14 | BGC0000388 |
| | | 10 | NRPS | 3255692 | 3303369 | Scabichelin | NRPS | 40 | BGC0000423 |
| | | 11 | Thiopeptide | 4261495 | 4293979 | - | - | - | - |
| | | 12 | NRPS | 4315553 | 4359151 | Chloramphenicol | other | 11 | BGC0000893 |
| | | 13 | Terpene | 4779878 | 4799990 | Albaflavenone | terpene | 100 | BGC0000660 |
| | | 14 | Terpene | 5108889 | 5129563 | Geosmin | other | 100 | BGC0001181 |
| | | 15 | Siderophore | 5371274 | 5386444 | - | - | - | - |
| | | 16 | Bacteriocin | 5711807 | 5720919 | - | - | - | - |
| | | 17 | Bacteriocin | 6082308 | 6092523 | - | - | - | - |


Appendix Table A3.2 Biosynthetic gene clusters detected in Antarctic bacterial genomes by antiSMASH cont.

| Isolate | Contig | Region | BGC type | From | To | Most similar cluster | Natural product | % of genes similar | MIBiG BGC-ID |
|---|---|---|---|---|---|---|---|---|---|
| Streptomyces NBH77 | 1 | 18 | Terpene | 6152296 | 6178960 | Hopene | terpene | 76 | BGC0000663 |
| | | 19 | NRPS-T1PKS | 6230946 | 6279565 | SGR PTMs | hybrid | 100 | BGC0001043 |
| | | 20 | NRPS fragment-T1PKS | 6313255 | 6360133 | Daptomycin | NRPS | 7 | BGC0000336 |
| | | 21 | NRPS-Otherks-TransatPKS | 6540628 | 6629986 | Leinamycin | hybrid | 16 | BGC0001101 |
| | | 22 | Lanthipeptide | 6638025 | 6661426 | - | - | - | - |
| Kribbella SPB151 | 1 | 1 | NRPS | 572163 | 689774 | Thiocoraline | NRPS | 34 | BGC0000445 |
| | | 2 | Lanthipeptide | 2383021 | 2406061 | - | - | - | - |
| | | 3 | NRPS | 2936473 | 2985624 | - | - | - | - |
| | | 4 | Arylpolyene | 3263913 | 3304977 | Avilamycin A | polyketide | 5 | BGC0000026 |
| | | 5 | Lanthipeptide | 3742846 | 3765809 | - | - | - | - |
| | | 6 | NRPS fragment | 5299839 | 5342542 | - | - | - | - |
| | | 7 | NRPS | 6462521 | 6542738 | Albachelin | NRPS | 20 | BGC0001211 |
| | | 8 | T3PKS | 6890371 | 6931417 | Alkylresorcinol | polyketide | 100 | BGC0000282 |
| | | 9 | NRPS-T1PKS | 6952251 | 7002793 | Asukamycin | polyketide | 3 | BGC0000187 |
| | | 10 | Siderophore | 7727856 | 7743032 | - | - | - | - |
| Mesorhizobium x2 INR15_SH29 | 1 | 1 | Lassopeptide | 259537 | 281875 | Salecan | saccharide | 12 | BGC0001380 |
| | | 2 | T3PKS | 2207871 | 2248941 | - | - | - | - |
| | | 3 | Hserlactone | 2750054 | 2768944 | - | - | - | - |
| | | 4 | Bacteriocin | 3114119 | 3125006 | - | - | - | - |
| | | 5 | NRPS fragment | 3219848 | 3263945 | - | - | - | - |
| | | 6 | Terpene | 3951461 | 3972312 | - | - | - | - |
| | | 7 | Hserlactone | 6604816 | 6625454 | - | - | - | - |
| | 2 | 1 | Betalactone | 2290826 | 2323439 | - | - | - | - |
| | | 2 | Bacteriocin | 2330151 | 2341071 | - | - | - | - |
| | | 3 | Terpene | 2592971 | 2613795 | Surfactin | NRPS | 8 | BGC0000433 |


Appendix Table A3.2 Biosynthetic gene clusters detected in Antarctic bacterial genomes by antiSMASH cont.

| Isolate | Contig | Region | BGC type | From | To | Most similar cluster | Natural product | % of genes similar | MIBiG BGC-ID |
|---|---|---|---|---|---|---|---|---|---|
| Mesorhizobium x2 INR15_SH29 | 3 | 1 | Hserlactone | 241308 | 261973 | - | - | - | - |
| | 4 | 1 | Hserlactone | 93080 | 113709 | - | - | - | - |
| | | 2 | T3PKS | 312831 | 353910 | - | - | - | - |
| Novosphingobium NBM11 | 1 | 1 | Hserlactone-Lassopeptide | 708438 | 741477 | - | - | - | - |
| | | 2 | Betalactone | 800482 | 826019 | - | - | - | - |
| | | 3 | Terpene | 893078 | 916017 | Astaxanthin dideoxyglycoside | hybrid | 75 | BGC0001086 |
| | | 4 | NRPS fragment | 2406638 | 2450738 | - | - | - | - |
| | | 5 | Terpene | 3501987 | 3526848 | Malleobactin | NRPS | 14 | BGC0000386 |
| | 2 | 1 | Bacteriocin | 861044 | 871880 | - | - | - | - |
| Leifsonia INR9 | 1 | 1 | Betalactone-Terpene | 224830 | 266944 | Carotenoid | terpene | 28 | BGC0000636 |
| | | 2 | NRPS fragment | 696644 | 740603 | Meilingmycin | polyketide | 4 | BGC0000093 |
| | | 3 | Phosphonate | 2316328 | 2357185 | Dehydrophos | other | 11 | BGC0000897 |
| | | 4 | T3PKS | 2902968 | 2944068 | Alkylresorcinol | polyketide | 100 | BGC0000282 |
| | | 5 | Bacteriocin | 3864567 | 3874851 | - | - | - | - |
| Pseudarthrobacter NBSH8 | 1 | 1 | Siderophore | 363535 | 375409 | Desferrioxamine B | other | 80 | BGC0000941 |
| | | 2 | NRPS fragment | 1452793 | 1495696 | Streptomycin | saccharide | 16 | BGC0000717 |
| | | 3 | Betalactone | 1527654 | 1552945 | - | - | - | - |
| | | 4 | T3PKS | 2108289 | 2149470 | - | - | - | - |
| | | 5 | NRPS fragment | 3245338 | 3287863 | Antimycin | hybrid | 20 | BGC0000958 |
| Cryobacterium INWT7 | 1 | 1 | T3PKS | 38230 | 79297 | Alkylresorcinol | polyketide | 100 | BGC0000282 |
| | | 2 | Betalactone | 1333898 | 1361237 | - | - | - | - |
| | | 3 | NRPS fragment | 2169601 | 2212558 | - | - | - | - |
| | | 4 | Terpene | 2298568 | 2317888 | - | - | - | - |
| | | 5 | Terpene | 3219678 | 3240619 | Carotenoid | terpene | 50 | BGC0000644 |


Appendix Table A3.2 Biosynthetic gene clusters detected in Antarctic bacterial genomes by antiSMASH cont.

| Isolate | Contig | Region | BGC type | From | To | Most similar cluster | Natural product | % of genes similar | MIBiG BGC-ID |
|---|---|---|---|---|---|---|---|---|---|
| Frigoribacterium NBH87 | 1 | 1 | NRPS fragment | 834160 | 878242 | - | - | - | - |
| | | 2 | Bacteriocin | 2069121 | 2079582 | - | - | - | - |
| | | 3 | Terpene | 2153844 | 2174758 | Carotenoid | terpene | 50 | BGC0000644 |
| | | 4 | Siderophore | 2672929 | 2684257 | Desferrioxamine B | other | 60 | BGC0000941 |
| | | 5 | T3PKS | 2848106 | 2889359 | Tetronasin | polyketide | 3 | BGC0000163 |
| Geodermatophilaceae NBWT11 | 1 | 1 | T3PKS | 1506594 | 1547622 | Alkyl-O-Dihydrogeranyl-Methoxyhydroquinone | hybrid | 28 | BGC0001077 |
| | | 2 | Terpene | 2350784 | 2370872 | Herboxidiene | polyketide | 2 | BGC0001065 |
| | | 3 | Terpene | 3743987 | 3764919 | Carotenoid | terpene | 18 | BGC0000633 |
| | | 4 | Lassopeptide | 3776256 | 3797850 | Sioxanthin | hybrid | 37 | BGC0001087 |
| Paracoccus NBH48 | 1 | 1 | T3PKS | 191330 | 232388 | - | - | - | - |
| | | 2 | Terpene | 243803 | 267376 | Carotenoid | terpene | 100 | BGC0000635 |
| | 2 | 1 | Ectoine | 451315 | 461704 | Ectoine | other | 100 | BGC0000860 |
| | 3 | 1 | Hserlactone | 585 | 21214 | - | - | - | - |
| Hymenobacter NBH84 | 1 | 1 | Terpene | 614672 | 634914 | Carotenoid | terpene | 71 | BGC0000650 |
| | | 2 | Bacteriocin | 1836800 | 1847672 | - | - | - | - |
| | | 3 | T3PKS | 3878793 | 3919905 | - | - | - | - |
| | | 4 | Terpene | 4381820 | 4402965 | - | - | - | - |
| Quadrisphaera INWT6 | 1 | 1 | T3PKS | 64448 | 104837 | Alkylresorcinol | polyketide | 100 | BGC0000282 |
| | | 2 | Terpene | 725933 | 746898 | Carotenoid | terpene | 18 | BGC0000633 |
| | 2 | 1 | Terpene | 425686 | 446978 | Sioxanthin | hybrid | 100 | BGC0001087 |
| Azospirillum INR13 | 9 | 1 | Betalactone-NRPS fragment | 186571 | 230689 | Fengycin | hybrid | 20 | BGC0001095 |
| | 10 | 1 | Otherks | 101961 | 147780 | Anthracimycin | T1PKS | 22 | BGC0001301 |
| Sphingomonas NBWT7 | 1 | 1 | Terpene | 2182684 | 2207019 | Astaxanthin dideoxyglycoside | hybrid | 75 | BGC0001086 |
| | | 2 | T3PKS | 2696542 | 2737591 | - | - | - | - |


Appendix Table A3.3 NRPS gene BLASTp and NaPDoS analysis of condensation domains.

| Isolate | Region | Gene tag | BLASTp description / accession (NRPS gene) | BLASTp Id (%) | Domain | Domain locus (aa) | NaPDoS closest | NaPDoS Id (%) | e-value | Pathway product |
|---|---|---|---|---|---|---|---|---|---|---|
| Streptomyces INR7 | 7 | 718 | NRPS_Streptomyces sp. H036 [WP_053673504.1] | 99 | C | 1192-1476 | cdaps3_C1_DCL | 41 | 3E-44 | CDA |
| | | | | | C | 2625-2914 | act3_C1_DCL | 35 | 7E-31 | actinomycin |
| | 10 | 1093 | NRPS_Streptomyces sp. XY593 [WP_078963433.1] | 99 | C | 1-308 | act3_C2_LCL | 58 | 3E-69 | actinomycin |
| | | | | | C | 1060-1353 | act3_C2_LCL | 59 | 6E-67 | actinomycin |
| | | | | | C | 2111-2412 | act3_C3_LCL | 60 | 5E-84 | actinomycin |
| | | | | | C | 3168-3469 | act3_C3_LCL | 60 | 8E-88 | actinomycin |
| | | 1099 | NRPS_multispecies [WP_078942128.1] | 99 | C | 27-321 | syrin1_C7_LCL | 44 | 3E-59 | syringomycin |
| | | | | | C | 1079-1373 | micro1_C1_modAA | 39 | 7E-50 | microcystin |
| | | 1100 | NRPS_multispecies [WP_053626857.1] | 100 | C | 607-897 | syrin1_C7_LCL | 40 | 3E-46 | syringomycin |
| | 19 | 3216 | NRPS_multispecies [WP_053613903.1] | 99 | C | 31-331 | syrin1_C6_LCL | 42 | 7E-43 | syringomycin |
| | 25 | 6188 | NRPS_Streptomyces sp. XY533 [WP_053612853.1] | 99 | C | 47-341 | act3_C2_LCL | 43 | 5E-57 | actinomycin |
| | | 6189 | NRPS_multispecies [WP_053612852.1] | 99 | C | 46-342 | syrin1_C9_LCL | 37 | 3E-31 | syringomycin |
| | | 6190 | Hypothetical_multispecies [WP_030657526.1] | 99 | C | 44-330 | syrin1_C5_LCL | 34 | 1E-22 | syringomycin |
| | 28 | 6844 | NRPS_Streptomyces sp. XY593 [WP_078963375.1] | 99 | C | 1-297 | act3_C2_LCL | 62 | 8E-89 | actinomycin |
| | | | | | C | 1061-1361 | act2_C2_LCL | 59 | 9E-85 | actinomycin |


Appendix Table A3.3 NRPS gene BLASTp and NaPDoS analysis of condensation domains cont.

| Isolate | Region | Gene tag | BLASTp description / accession (NRPS gene) | BLASTp Id (%) | Domain | Domain locus (aa) | NaPDoS closest | NaPDoS Id (%) | e-value | Pathway product |
|---|---|---|---|---|---|---|---|---|---|---|
| Streptomyces INR7 | 28 | 6845 | NRPS_Streptomyces virginiae [WP_030899157.1] | 96 | C | 8-303 | cdaps3_C1_DCL | 48 | 1E-47 | CDA |
| | | | | | C | 1036-1325 | cyclom1C4_LCL | 56 | 3E-69 | cyclomarin |
| | | | | | C | 2075-2375 | act3_C2_LCL | 56 | 4E-83 | actinomycin |
| | 30 | 7218 | NRPS_multispecies [WP_053626909.1] | 100 | C | 23-307 | ituri1_C3_LCL | 28 | 2E-09 | iturin |
| | | 7219 | NRPS_Streptomyces sp. XY533 [WP_053611980.1] | 99 | C | 1-292 | micro5_C1_H | 29 | 2E-23 | microcystin |
| | | 7222 | NDED_Streptomyces sp. XY593 [WP_063787970.1] | 99 | C | 58-351 | tioS_C2_LCL | 37 | 5E-35 | thiocoraline |
| Streptomyces NBSH44 | 11 | 3135 | NRPS_Streptomyces sp. SM13 [WP_103509423.1] | 91 | C | 618-922 | cdaps1_C6_LCL | 56 | 2E-78 | CDA |
| | | | | | C | 2355-2646 | act3_C1_DCL | 45 | 5E-61 | actinomycin |
| | 17 | 5121 | NRPS_Couchioplanes caeruleus [WP_071803074.1] | 82 | C | 817-1119 | micro2_C1_DCL | 39 | 3E-52 | microcystin |
| | | 5122 | NRPS_multispecies [WP_078588596.1] | 81 | C | 46-346 | bleom9_C1_LCL | 45 | 1E-59 | bleomycin |
| | 17 | 5122 | NRPS_multispecies [WP_078588596.1] | 81 | C | 1574-1876 | micro2_C1_DCL | 41 | 2E-57 | microcystin |
| | 18 | 5183 | aaADP_Streptomyces sp. ADI98-10 [WP_124265733.1] | 96 | C | 1595-1890 | act3_C1_DCL | 51 | 8E-75 | actinomycin |
| | | 5189 | NRPS_Streptomyces anulatus [WP_033895289.1] | 94 | C | 9-304 | act2_C1_start | 42 | 3E-49 | actinomycin |
| | Chr2_3 | 136 | hypothetical protein Streptomyces sp. CB02058 [WP_073753584.1] | 85 | C | 33-306 | Stro2721_1 | 44 | 7E-64 | sporolide |


Appendix Table A3.3 NRPS gene BLASTp and NaPDoS analysis of condensation domains cont.

| Isolate | Region | Gene tag | BLASTp description / accession (NRPS gene) | BLASTp Id (%) | Domain | Domain locus (aa) | NaPDoS closest | NaPDoS Id (%) | e-value | Pathway product |
|---|---|---|---|---|---|---|---|---|---|---|
| Streptomyces NBH77 | 8 | 2411 | NRPS_Streptomyces griseus [SUP57408.1] | 99 | C | 600-891 | cdaps1_C7_LCL | 44 | 6E-48 | CDA |
| | | | | | C | 2099-2399 | bleom8_C1_DCL | 42 | 3E-37 | bleomycin |
| | | | | | C | 3154-3452 | bleom9_C1_LCL | 47 | 3E-52 | bleomycin |
| | | | | | C | 4664-4963 | ituri2_C5_DCL | 39 | 2E-56 | iturin |
| | | 2412 | aaADP_Streptomyces griseus [WP_115068788.1] | 99 | C | 639-943 | cdaps1_C2_LCL | 44 | 7E-47 | CDA |
| | | | | | C | 2165-2469 | micro2_C1_DCL | 35 | 5E-48 | microcystin |
| | | | | | C | 3215-3503 | syrin1_C9_LCL | 43 | 2E-56 | syringomycin |
| | | 2413 | aaADP_Streptomyces griseus [WP_115068787.1] | 99 | C | 45-363 | syrin1_C6_LCL | 42 | 1E-40 | syringomycin |
| | | | | | C | 1141-1437 | syrin1_C9_LCL | 45 | 4E-56 | syringomycin |
| | | | | | C | 2657-2962 | micro2_C1_DCL | 39 | 2E-56 | microcystin |
| | | | | | C | 3705-3999 | syrin1_C9_LCL | 42 | 2E-61 | syringomycin |
| | | | | | C | 5203-5505 | micro2_C1_DCL | 36 | 1E-51 | microcystin |
| | | | | | C | 6252-6552 | cdaps3_C2_LCL | 44 | 6E-55 | CDA |
| | | | | | C | 1082-1379 | cdaps3_C2_LCL | 46 | 2E-46 | CDA |
| | | 2414 | aaADP_Streptomyces griseus [WP_115068789.1] | 99 | C | 45-340 | cyclom1C5_LCL | 46 | 2E-51 | cyclomarin |
| | | | | | C | 1082-1379 | cdaps3_C2_LCL | 46 | 2E-46 | CDA |
| | | | | | C | 2618-2917 | ituri3_C2_DCL | 37 | 6E-54 | iturin |
| | 9 | 2647 | NRPS_Streptomyces sp. TSRI0384-2 [WP_100455272.1] | 99 | C | 627-936 | syrin1_C6_LCL | 38 | 3E-32 | syringomycin |
| | | | | | C | 2158-2460 | ituri3_C2_DCL | 37 | 8E-55 | iturin |


Appendix Table A3.3 NRPS gene BLASTp and NaPDoS analysis of condensation domains cont.

| Isolate | Region | Gene tag | BLASTp description / accession (NRPS gene) | BLASTp Id (%) | Domain | Domain locus (aa) | NaPDoS closest | NaPDoS Id (%) | e-value | Pathway product |
|---|---|---|---|---|---|---|---|---|---|---|
| Streptomyces NBH77 | 9 | 2648 | aaADP_Streptomyces sp. ADI98-12 [WP_124287396.1] | 99 | C | 45-344 | cdaps1_C7_LCL | 45 | 1E-55 | CDA |
| | | | | | C | 1588-1889 | ituri3_C2_DCL | 40 | 6E-49 | iturin |
| | | | | | C | 2653-2953 | syrin1_C7_LCL | 42 | 2E-57 | syringomycin |
| | 10 | 2703 | NRPS_Streptomyces griseus [WP_115068909.1] | 99 | C | 11-307 | act2_C1_start | 38 | 9E-28 | actinomycin |
| | | | | | C | 1052-1347 | prist2_C3_LCL | 55 | 3E-56 | pristinamycin |
| | | | | | C | 2113-2414 | act3_C3_LCL | 58 | 7E-73 | actinomycin |
| | 12 | 3655 | aaADP_Streptomyces sp. ADI98-12 [WP_124287620.1] | 99 | C | 604-904 | syrin1_C9_LCL | 35 | 9E-30 | syringomycin |
| Kribbella SPB151 | 1 | 549 | NRPS_Saccharothrix carnea [PSL53320.1] | 57 | C | 614-921 | syrin1_C7_LCL | 41 | 8E-51 | syringomycin |
| | | | | | C | 2110-2404 | micro2_C1_DCL | 39 | 8E-59 | microcystin |
| | | | | | C | 3578-3879 | micro2_C1_DCL | 38 | 5E-52 | microcystin |
| | | 550 | NRPS_Saccharothrix carnea [WP_106618241.1] | 57 | C | 41-333 | cdaps3_C2_LCL | 47 | 4E-59 | CDA |
| | | | | | C | 1566-1867 | micro2_C1_DCL | 39 | 5E-64 | microcystin |
| | | | | | C | 2634-2854 | syrin1_C9_LCL | 49 | 6E-40 | syringomycin |
| | | 551 | NRPS_Amycolatopsis orientalis [WP_044850574.1] | 57 | C | 1037-1340 | micro2_C1_DCL | 38 | 7E-54 | microcystin |
| | | | | | C | 2104-2401 | syrin1_C9_LCL | 45 | 7E-55 | syringomycin |
| | | 578 | NRPS_Streptomyces lunaelactis [WP_108155207.1] | 49 | C | 4-315 | tioR_C1_start | 45 | 1E-36 | thiocoraline |
| | | | | | C | 1564-1860 | tioR_C3_DCL | 51 | 2E-72 | thiocoraline |
| | | | | | C | 2609-2911 | tioS_C1_LCL | 56 | 4E-90 | thiocoraline |


Appendix Table A3.3 NRPS gene BLASTp and NaPDoS analysis of condensation domains cont.

| Isolate | Region | Gene tag | BLASTp description / accession (NRPS gene) | BLASTp Id (%) | Domain | Domain locus (aa) | NaPDoS closest | NaPDoS Id (%) | e-value | Pathway product |
|---|---|---|---|---|---|---|---|---|---|---|
| Kribbella SPB151 | 1 | 579 | NRPS_Streptomyces sp. 73 [WP_101418611.1] | 63 | C | 1-301 | tioS_C1_LCL | 56 | 2E-92 | thiocoraline |
| | | | | | C | 1423-1724 | tioS_C2_LCL | 65 | 1E-104 | thiocoraline |
| | 3 | 2794 | aaADP_Kribbella sp. NEAU-SW521 [WP_112242521.1] | 80 | C | 428-711 | tyroc2_C3_LCL | 34 | 3E-34 | tyrocidine |
| | | 2795 | aaADP_Kribbella sp. NEAU-SW521 [WP_112242523.1] | 75 | C | 35-313 | cyclom1C2_LCL | 43 | 6E-46 | cyclomarin |
| | | | | | C | 1148-1434 | syrin1_C9_LCL | 44 | 6E-46 | syringomycin |
| | 7 | 6340 | aaADP_Kribbella sp. NEAU-SW521 [WP_112239288.1] | 93 | C | 269-554 | syrin1_C6_LCL | 41 | 2E-52 | syringomycin |
| | | 6341 | aaADP_Kribbella sp. NEAU-SW521 [WP_112239146.1] | 91 | C | 32-327 | syrin1_C9_LCL | 42 | 3E-58 | syringomycin |
| | | | | | C | 1086-1384 | micro1_C1_modAA | 36 | 1E-35 | microcystin |
| | | | | | C | 2171-2470 | micro1_C1_modAA | 33 | 1E-29 | microcystin |
| | | 6350 | aaADP_Kribbella sp. NEAU-SW521 [WP_112248827.1] | 89 | C | 10-308 | bacil2_C1_start | 40 | 6E-56 | bacillibactin |
| | | | | | C | 1359-1658 | cdaps1_C6_LCL | 57 | 3E-91 | CDA |
| | | | | | C | 2934-3230 | cdaps3_C1_DCL | 51 | 4E-80 | CDA |

aaADP: amino acid adenylation domain-containing protein
NDED: NAD-dependent epimerase/dehydratase family protein
CDA: Calcium-dependent antibiotic


Appendix Table A3.4 Type I PKS gene BLASTp and NaPDoS analysis of ketosynthase domains.

| Isolate | Region | Gene tag | BLASTp description / accession (Type I PKS gene) | BLASTp Id (%) | Domain | Domain locus (aa) | NaPDoS closest | NaPDoS Id (%) | e-value | Pathway product |
|---|---|---|---|---|---|---|---|---|---|---|
| Streptomyces INR7 | 9 | 1049 | T1PKS_Streptomyces sp. H036 [WP_107092636.1] | 99 | KS | 670-1110 | ArsA_Azotobacter_PUFA | 53 | 7E-111 | alkylresorcinol |
| | | 1050 | T1PKS_multispecies [WP_051734545.1] | 99 | KS | 15-478 | PfaA_Shewanella_PUFA | 38 | 1E-76 | PUFA |
| | 12 | 1417 | T1PKS_Streptomyces sp. XY593 [WP_053685078.1] | 99 | KS | 101-517 | EpoD_Q9L8C7_4mod | 54 | 3E-105 | epothilone |
| | 13 | 1458 | T1PKS_multispecies [WP_053625151.1] | 100 | KS | 3-414 | EpoD_Q9L8C7_4mod | 49 | 6E-90 | epothilone |
| | 21 | 4100 | T1PKS_Streptomyces sp. XY593 [WP_053683944.1] | 99 | KS | 20-445 | TetA_BAE93722_KS1 | 58 | 9E-120 | tetronomycin |
| | | | | | KS | 1046-1471 | AveA3_Q9S0R4_2mod | 70 | 1E-170 | avermectin |
| | | 4103 | T1PKS_multispecies [WP_051734506.1] | 99 | KS | 30-455 | EpoE_Q9L8C6_1mod | 44 | 1E-77 | epothilone |
| | | 4104 | T1PKS_Streptomyces sp. XY593 [WP_078963465.1] | 100 | KS | 36-459 | EpoD_Q9L8C7_4mod | 51 | 1E-115 | epothilone |
| | | 4111 | PKS_Streptomyces sp. XY511 [WP_078943949.1] | 100 | KS | 39-449 | EpoE_Q9L8C6_1mod | 55 | 8E-112 | epothilone |
| | | 4112 | T1PKS_Streptomyces sp. XY511 [WP_107087159.1] | 99 | KS | 8-428 | EpoD_Q9L8C7_4mod | 47 | 2E-99 | epothilone |
| | | 4114 | b-ketoacyl-ACP synthase_multispecies [WP_030649161.1] | 100 | KS | 140-400 | FabF_Bacillus_FAS | 41 | 4E-55 | FAS |
| | | 4119 | hypothetical_multispecies [WP_053625404.1] | 99 | KS | 74-394 | FabF_Bacillus_FAS | 35 | 4E-33 | FAS |
| | 24 | 6046 | T1PKS_multispecies [WP_030744309.1] | 99 | KS | 3-458 | PfaA_Shewanella_PUFA | 38 | 1E-81 | PUFA |


Appendix Table A3.4 Type I PKS gene BLASTp and NaPDoS analysis of ketosynthase domains cont.

| Isolate | Region | Gene tag | BLASTp description / accession (Type I PKS gene) | BLASTp Id (%) | Domain | Domain locus (aa) | NaPDoS closest | NaPDoS Id (%) | e-value | Pathway product |
|---|---|---|---|---|---|---|---|---|---|---|
| Streptomyces NBSH44 | Chr2_2 | 66 | T1PKS_Streptomyces sp. CB02058 [WP_073753628.1] | 91 | KS | 3-461 | C1027_AAL06699_ene9 | 92 | 0 | C-1027 enediyne |
| Streptomyces NBH77 | 20 | 5282 | SDROR_Streptomyces sp. ADI98-12 [WP_124288063.1] | 99 | KS | 1-368 | AveA2_Q9S0R7_2mod | 73 | 2E-145 | avermectin |
| Azospirillum INR13 | 1 | 108 | T1PKS_Azospirillum lipoferum [WP_014188491.1] | 93 | KS | 2-227 | PfaA_Shewanella_PUFA | 42 | 8E-42 | PUFA |
| | | 111 | T1PKS_Azospirillum lipoferum [WP_014188492.1] | 95 | KS | 12-418 | PfaA_Shewanella_PUFA | 56 | 2E-119 | PUFA |

SDROR: SDR family oxidoreductase


Appendix Table A3.5 Hybrid NRPS/Type I PKS gene BLASTp and NaPDoS analysis of condensation and ketosynthase domains.

| Isolate | Region | Gene tag | BLASTp description / accession (Hybrid NRPS/Type I PKS gene) | BLASTp Id (%) | Domain | Domain locus (aa) | NaPDoS closest | NaPDoS Id (%) | e-value | Pathway product |
|---|---|---|---|---|---|---|---|---|---|---|
| Streptomyces INR7 | 16 | 1692 | T1PKS_Streptomyces sp. fd1-xmd [WP_078095288.1] | 84 | KS | 800-1225 | AveA3_Q9S0R4_3mod | 74 | 1E-175 | avermectin |
| | | | | | KS | 2576-3002 | NysC_Q9L4W3_2mod | 77 | 0 | nystatin |
| | | | | | KS | 4336-4762 | AveA3_Q9S0R4_3mod | 74 | 3E-178 | avermectin |
| | | 1693 | T1PKS_multispecies [WP_053626004.1] | 100 | KS | 37-464 | AveA2_Q9S0R7_1mod | 74 | 6E-178 | avermectin |
| | | 1697 | NRPS_Streptomyces sp. XY533 [WP_053612455.1] | 100 | C | 30-323 | bleom9_C1_LCL | 42 | 5E-28 | bleomycin |
| | | 1701 | NRPS_Streptomyces sp. XY511 [WP_078944041.1] | 99 | C | 1-301 | cyclom1C2_LCL | 57 | 2E-78 | cyclomarin |
| | | | | | C | 1571-1869 | cdaps3_C1_DCL | 48 | 9E-67 | CDA |
| | | | | | C | 3017-3319 | act3_C2_LCL | 57 | 9E-82 | actinomycin |
| | | | | | C | 4577-4875 | tioR_C3_DCL | 48 | 6E-62 | thiocoraline |
| | 23 | 5979 | T1PKS_Streptomyces sp. XY511 [WP_053627656.1] | 99 | KS | 20-450 | EpoC_Q9L8C8_H | 52 | 5E-116 | epothilone |
| | | 5980 | NRPS_Streptomyces sp. XY511 [WP_053627657.1] | 99 | C | 632-925 | micro3_C1_LCL | 29 | 8E-30 | microcystin |
| | | 5997 | b-ketoacyl-ACP synthase_multispecies [WP_030660568.1] | 100 | KS | 54-389 | FabF_Bacillus_FAS | 46 | 5E-63 | FAS |
| Streptomyces NBSH44 | 4 | 1134 | aaADP_Streptomyces sp. ADI97-07 [WP_124280375.1] | 75 | KS | 668-1101 | KirAI_CAN89631_2T | 43 | 2E-82 | kirromycin |
| | | | | | C | 2124-2419 | tioS_C2_LCL | 38 | 3E-12 | thiocoraline |
| | 9 | 2373 | T1PKS_Streptomyces sp. [APD71787.1] | 82 | KS | 5-436 | EpoC_Q9L8C8_H | 48 | 1E-100 | epothilone |


Appendix Table A3.5 Hybrid NRPS/Type I PKS gene BLASTp and NaPDoS analysis of condensation and ketosynthase domains cont.

| Isolate | Region | Gene tag | BLASTp description / accession (Hybrid NRPS/Type I PKS gene) | BLASTp Id (%) | Domain | Domain locus (aa) | NaPDoS closest | NaPDoS Id (%) | e-value | Pathway product |
|---|---|---|---|---|---|---|---|---|---|---|
| Streptomyces NBSH44 | 9 | 2374 | AmTc3PP_Streptomyces sp. NRRL S-1824 [WP_052189335.1] | 78 | C | 905-1195 | syrin1_C6_LCL | 34 | 4E-36 | syringomycin |
| | | | | | C | 1413-1716 | cyclom1C3_LCL | 34 | 3E-30 | cyclomarin |
| | 16 | 4880 | T1PKS_Rhizobium sp. R339 [WP_088678735.1] | 40 | KS | 624-1040 | EpoC_Q9L8C8_H | 47 | 3E-96 | epothilone |
| | | | | | C | 2078-2339 | grami2_C4_LCL | 25 | 2E-11 | gramicidin |
| Streptomyces NBH77 | 2 | 198 | NRPS_Streptomyces sp. ADI98-12 [RPK82802.1] | 99 | C | 3-305 | act2_C1_start | 36 | 6E-27 | actinomycin |
| | | | | | C | 1070-1371 | act3_C3_LCL | 50 | 9E-65 | actinomycin |
| | | | | | C | 2104-2399 | act3_C1_DCL | 45 | 2E-32 | actinomycin |
| | | | | | C | 3129-3429 | cdaps2_C3_LCL | 42 | 3E-34 | CDA |
| | | 199 | aaADP_Streptomyces griseus [WP_115069650.1] | 99 | KS | 29-439 | KirAIV_CAN89634_8T | 41 | 2E-61 | kirromycin |
| | | | | | C | 812-1108 | prist2_C3_LCL | 38 | 1E-42 | pristinamycin |
| | | | | | C | 1882-2165 | act3_C1_DCL | 46 | 5E-45 | actinomycin |
| | 3 | 252 | protein kinase_Streptomyces sp. ADI98-12 [WP_124288328.1] | 99 | C | 22-250 | bleom9_C1_LCL | 28 | 2E-06 | bleomycin |
| | | 253 | aaADP_Streptomyces griseus [WP_115068275.1] | 99 | C | 7-301 | bacil2_C1_start | 43 | 7E-69 | bacillibactin |
| | | | | | C | 1069-1369 | cyclom1C2_LCL | 51 | 6E-52 | cyclomarin |
| | | 254 | ATDP_Streptomyces sp. ADI98-12 [WP_124288326.1] | 100 | KS | 14-443 | EpoC_Q9L8C8_H | 53 | 7E-108 | epothilone |
| | | 285 | T1PKS_Streptomyces sp. ADI98-12 [RPK82844.1] | 99 | KS | 666-1093 | NysC_Q9L4W3_5mod | 69 | 5E-168 | nystatin |
| | | 289 | SDROR_Streptomyces griseus [WP_115068282.1] | 99 | KS | 36-460 | NysC_Q9L4W3_2mod | 72 | 2E-149 | nystatin |


Appendix Table A3.5 Hybrid NRPS/Type I PKS gene BLASTp and NaPDoS analysis of condensation and ketosynthase domains cont.

| Isolate | Region | Gene tag | BLASTp description / accession (Hybrid NRPS/Type I PKS gene) | BLASTp Id (%) | Domain | Domain locus (aa) | NaPDoS closest | NaPDoS Id (%) | e-value | Pathway product |
|---|---|---|---|---|---|---|---|---|---|---|
| Streptomyces NBH77 | 3 | 289 | SDROR_Streptomyces griseus [WP_115068282.1] | 99 | KS | 1795-2218 | NysC_Q9L4W3_5mod | 80 | 0 | nystatin |
| | | | | | KS | 3539-3962 | NysC_Q9L4W3_5mod | 81 | 0 | nystatin |
| | | | | | KS | 5264-5687 | NysC_Q9L4W3_5mod | 79 | 0 | nystatin |
| | | | | | KS | 6975-7390 | NysC_Q9L4W3_5mod | 73 | 3E-177 | nystatin |
| | | | | | KS | 8842-9254 | NysC_Q9L4W3_5mod | 73 | 5E-178 | nystatin |
| | | 290 | SDROR_Streptomyces griseus [WP_115068283.1] | 99 | KS | 35-461 | NysC_Q9L4W3_3mod | 74 | 0 | nystatin |
| | | | | | KS | 1600-2027 | NysI_Q9L4X3_3mod | 68 | 2E-176 | nystatin |
| | | | | | KS | 3705-4132 | AveA3_Q9S0R4_3mod | 72 | 2E-167 | avermectin |
| | | 291 | SDROR_Streptomyces griseus [WP_115068284.1] | 100 | KS | 35-457 | NysJ_Q9L4X2_3mod | 74 | 7E-178 | nystatin |
| | | 292 | T1PKS_Streptomyces sp. TSRI0384-2 [WP_100455788.1] | 99 | KS | 36-461 | NysJ_Q9L4X2_1mod | 79 | 3E-178 | nystatin |
| | | | | | KS | 2074-2499 | NysJ_Q9L4X2_2mod | 73 | 0 | nystatin |
| | | | | | KS | 4118-4544 | NysJ_Q9L4X2_2mod | 72 | 8E-175 | nystatin |
| | | | | | KS | 5646-6072 | NysK_Q9L4X1_1mod | 76 | 0 | nystatin |


Appendix Table A3.5 Hybrid NRPS/Type I PKS gene BLASTp and NaPDoS analysis of condensation and ketosynthase domains cont.

| Isolate | Region | Gene tag | BLASTp description / accession (Hybrid NRPS/Type I PKS gene) | BLASTp Id (%) | Domain | Domain locus (aa) | NaPDoS closest | NaPDoS Id (%) | e-value | Pathway product |
|---|---|---|---|---|---|---|---|---|---|---|
| Streptomyces NBH77 | 3 | 293 | SDROR_Streptomyces griseus [WP_115068285.1] | 99 | KS | 35-451 | NysI_Q9L4X3_1mod | 75 | 0 | nystatin |
| | | | | | KS | 1782-2204 | NysC_Q9L4W3_5mod | 82 | 0 | nystatin |
| | | | | | KS | 3342-3764 | NysI_Q9L4X3_3mod | 78 | 0 | nystatin |
| | | | | | KS | 4873-5298 | NysI_Q9L4X3_4mod | 79 | 0 | nystatin |
| | | | | | KS | 6429-6848 | NysI_Q9L4X3_5mod | 77 | 0 | nystatin |
| | | | | | KS | 7947-8368 | NysI_Q9L4X3_6mod | 74 | 7E-173 | nystatin |
| | 5 | 427 | aaADP_Streptomyces sp. ADI98-12 [WP_124288291.1] | 99 | C | 433-736 | micro1_C1_modAA | 33 | 2E-32 | microcystin |
| | | 428 | ATDP_Streptomyces sp. ADI98-12 [WP_124288289.1] | 99 | C | 117-408 | syrin1_C9_LCL | 38 | 4E-37 | syringomycin |
| | | 432 | ATDP_Streptomyces sp. ADI98-12 [WP_124288289.1] | 99 | KS | 13-433 | EpoC_Q9L8C8_H | 50 | 4E-106 | epothilone |
| | | 433 | aaADP_Streptomyces sp. ADI98-12 [WP_124288288.1] | 99 | C | 1138-1411 | mycos1_C3_LCL | 27 | 1E-14 | mycosubtilin |
| | 19 | 5219 | hybrid NRPS/T1PKS_Streptomyces sp. TSRI0384-2 [WP_100453753.1] | 99 | KS | 12-438 | HSAF_ABL86391_i | 78 | 0 | HSAF |
| | | | | | C | 1857-2153 | Sare2407_1 | 70 | 4E-113 | putative HSAF |
| | 21 | 5487 | aaADP_Streptomyces sp. ADI98-12 [WP_124287764.1] | 99 | KS | 1849-2259 | LnmI_AF484556_1T | 58 | 2E-104 | leinamycin |


Appendix Table A3.5 Hybrid NRPS/Type I PKS gene BLASTp and NaPDoS analysis of condensation and ketosynthase domains cont.

| Isolate | Region | Gene tag | BLASTp description / accession (Hybrid NRPS/Type I PKS gene) | BLASTp Id (%) | Domain | Domain locus (aa) | NaPDoS closest | NaPDoS Id (%) | e-value | Pathway product |
|---|---|---|---|---|---|---|---|---|---|---|
| Streptomyces NBH77 | 21 | 5487 | aaADP_Streptomyces sp. ADI98-12 [WP_124287764.1] | 99 | KS | 2510-2924 | LnmI_AF484556_2T | 65 | 9E-127 | leinamycin |
| | | | | | KS | 3717-4133 | LnmI_AF484556_3T | 72 | 6E-130 | leinamycin |
| | | 5488 | SDROR_Streptomyces sp. TSRI0384-2 [WP_100454159.1] | 99 | KS | 898-1318 | LnmJ_AF484556_1T | 69 | 2E-171 | leinamycin |
| | | | | | KS | 2083-2508 | LnmJ_AF484556_2T | 64 | 3E-157 | leinamycin |
| | | | | | KS | 3271-3707 | LnmJ_AF484556_3T | 67 | 3E-178 | leinamycin |
| | | | | | KS | 4570-4998 | LnmJ_AF484556_4T | 66 | 1E-147 | leinamycin |
| | | 5493 | PK b-ketoacyl ACP synthase_multispecies [WP_100454163.1] | 99 | KS | 93-421 | JamG_AAS98778_mod | 42 | 2E-44 | jamaicamide |
| Kribbella SPB151 | 9 | 6779 | aaADP_Kribbella sp. NEAU-SW521 [WP_112236566.1] | 92 | KS | 1447-1837 | EpoC_Q9L8C8_H | 46 | 4E-89 | epothilone |

aaADP: amino acid adenylation domain-containing protein
ATDP: acyltransferase domain-containing protein
CDA: Calcium-dependent antibiotic
HSAF: heat stable antifungal factor
AmTc3PP: Aminotransferase class III-fold pyridoxal phosphate-dependent enzyme
SDROR: SDR family oxidoreductase


Appendix Table A3.6 Type II PKS gene BLASTp and NaPDoS analysis of ketosynthase domains.

| Isolate | Region | Gene tag | BLASTp description / accession (Type II PKS gene) | BLASTp Id (%) | Domain | Domain locus (aa) | NaPDoS closest | NaPDoS Id (%) | e-value | Pathway product |
|---|---|---|---|---|---|---|---|---|---|---|
| Streptomyces INR7 | 17 | 817 | β-ketoacyl-ACP synthase_multispecies [WP_030650395.1] | 99 | KS | 2090689-2091954 | actinorh_NP_629237_KSa | 76 | 3.00E-151 | actinorhodin |
| | | 818 | KS CLF_multispecies [WP_053628071.1] | 100 | KS | 2090689-2091954 | AlnM_ACI88862_KSb | 61 | 2.00E-109 | alnumycin |
| | 27 | 6551 | β-ketoacyl-ACP synthase_multispecies [WP_051734969.1] | 100 | KS | 7344518-7345825 | AlnL_ACI88861_KSa | 64 | 9.00E-117 | alnumycin |
| | | 6552 | PK β-ketoacyl synthase_multispecies [WP_030654459.1] | 99 | KS | 7345822-7347060 | SaqB_ACP19354_KSb | 54 | 6.00E-72 | saquayamycin |

