EXPLORING MICROBIAL DARK MATTER IN EAST ANTARCTIC SOILS

Mukan Ji

A thesis in fulfilment of requirements for the degree of Doctor of Philosophy

School of Biotechnology and Biomolecular Science

Faculty of Science

UNSW Australia

ORIGINALITY STATEMENT

'I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UNSW or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project's design and conception or in style, presentation and linguistic expression is acknowledged.'

Signed

Date

COPYRIGHT STATEMENT

'I hereby grant the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or part in the University libraries in all forms of media, now or here after known, subject to the provisions of the Copyright Act 1968. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation. I also authorise University Microfilms to use the 350 word abstract of my thesis in Dissertation Abstract International (this is applicable to doctoral theses only). I have either used no substantial portions of copyright material in my thesis or I have obtained permission to use copyright material; where permission has not been granted I have applied/will apply for a partial restriction of the digital copy of my thesis or dissertation.'

Date

AUTHENTICITY STATEMENT

'I certify that the Library deposit digital copy is a direct equivalent of the final officially approved version of my thesis. No emendation of content has occurred and if there are any minor variations in formatting, they are the result of the conversion to digital format.'

Date .....

Table of contents

Table of contents ...... i

Abstract ...... v

Acknowledgement ...... vii

List of Tables ...... viii

List of Figures ...... ix

List of Appendices ...... xii

List of abbreviations ...... xiii

Chapter 1 General introduction ...... 1

1.1 Terrestrial Antarctica harbours abundant and diverse microbial communities ...... 1

1.2 Cold adaption strategies of Antarctic ...... 6

1.3 Functional capacities of Antarctic soil bacteria ...... 8

1.4 The role of biotic and abiotic factors in shaping Antarctic soil microbial communities ...... 11

1.5 Next generation sequencing technologies and expanding our understanding of microbial diversity and function ...... 12

1.6 The Windmill Islands, an under-explored ice free region ...... 18

1.7 Aims ...... 21

Chapter 2 Microbial diversity at Mitchell Peninsula, East Antarctica: a potential microbial dark matter “hotspot” ...... 22

2.1 INTRODUCTION ...... 22

2.2 MATERIAL AND METHODS ...... 23

2.2.1 Soil sampling ...... 23

2.2.2 Physicochemical analyses ...... 23

2.2.3 DNA extraction ...... 24

2.2.4 Multiplex 454 pyrosequencing and data processing ...... 24

2.2.5 Quantitative PCR (qPCR) targeting bacterial SSU RNA and fungal SSU RNA genes ...... 27

2.2.6 Statistical analysis ...... 27

2.2.7 Network analysis ...... 28

2.3 RESULTS ...... 29

2.3.1 Soil properties ...... 29

2.3.2 Soil microbial abundance and diversity ...... 30

2.3.3 Microbial community similarity across Mitchell Peninsula ...... 33

2.3.4 Network analysis of bacterial and fungal communities and measured environmental parameters ...... 40

2.4 DISCUSSION ...... 42

Chapter 3 Atmospheric chemotrophy: A unique functional capacity of East Antarctic microbial communities ...... 45

3.1 Introduction ...... 45

3.2 Material and methods ...... 46

3.2.1 Site description ...... 46

3.2.2 Metagenome sequencing and assembly ...... 49

3.2.3 Population genome binning and assessment ...... 49

3.2.4 Metabolic annotation ...... 50

3.2.5 Microbial community profile ...... 50

3.2.6 Phylogenetic inference ...... 50

3.2.7 Comparative genomics analysis ...... 51

3.2.8 Gene cluster and phylogenetic analysis of RubisCO, hydrogenase and ammonia monooxygenase subunit C (amoC) genes ...... 51

3.3 Results ...... 52

3.3.1 The Robinson Ridge metagenome ...... 52

3.3.2 Recovery of draft genomes from Robinson Ridge soil ...... 55

3.3.3 Energy and nutrient dynamics inferred from the Robinson Ridge metagenome ...... 59

3.4 Discussion ...... 70

Chapter 4 Characterisation of candidate phylum WPS-2 ...... 73

4.1 Introduction ...... 73

4.2 Materials and methods ...... 74

4.2.1 Genome analysis ...... 74

4.2.2 Probe design and optimisation ...... 74

4.2.3 Construction of a WPS-2 specific clone ...... 75

4.2.4 Clone-FISH ...... 78

4.2.5 WPS-2 specific Clone-FISH optimisation ...... 78

4.2.6 Epi-fluorescence microscopy ...... 79

4.2.7 Extraction of cells from soil using Nycodenz ...... 79

4.3 Results ...... 79

4.3.1 Genomic analysis of WPS-2 ...... 79

4.3.2 In silico probe validation ...... 84

4.3.3 WPS-2 specific probe design and optimisation using clone-FISH ...... 84

4.3.4 FISH on Antarctic soil bacteria ...... 85

4.4 Discussion ...... 86

Chapter 5 The phylogeny and environmental drivers of WPS-2 ...... 88

5.1 Introduction ...... 88

5.2 Materials and methods ...... 89

5.2.1 WPS-2 sequence retrieval and alignment ...... 89

5.2.2 Phylogenetic construction ...... 90

5.2.3 Metagenomic data retrieval ...... 90

5.2.4 Environmental parameter transformation and normalisation ...... 92

5.2.5 Transformation of WPS-2 OTU abundance matrix ...... 92

5.2.6 Correlation analysis between WPS-2 OTUs and environmental factors ...... 92

5.2.7 Dominant WPS-2 OTUs and soil sample selection ...... 92

5.2.8 Identifying key environmental parameters driving WPS-2 distribution ...... 93

5.2.9 Co-occurrence network analysis ...... 93

5.3 Results ...... 94

5.3.1 Phylogeny of WPS-2 ...... 94

5.3.2 Environmental correlations to WPS-2 abundance ...... 101

5.4 Discussion ...... 106

Chapter 6 Discussion and conclusions ...... 108

Atmospheric hydrogen gas as an exclusive energy source for carbon fixation ...... 108

Form IE RubisCO is abundant in the Windmill Islands, and Actinobacteria as a potential CO2 sink in Antarctica ...... 111

Nitrogen balancing in the Windmill Islands polar deserts ...... 112

Requirement of SSU gene standardisation among databases ...... 113

Proposed functional capacities of candidate phyla WPS-2 and AD3 ...... 114

Future research ...... 115

Conclusions ...... 116

References ...... 118

Appendices ...... 154

Abstract

Antarctic desert soil ecosystems are predominantly composed of prokaryotes, which have developed unique ecological functions to cope with the extreme environmental conditions experienced in Antarctica. While Antarctic soils have exhibited diverse microbial community structures, including the presence of rare bacterial lineages, the ecological functions and genomic capacities of the microbial dark matter in this environment have remained largely unexplored.

Mitchell Peninsula and Robinson Ridge are polar desert sites located in the Windmill Islands, East Antarctica, which are very low in carbon and nitrogen. Here, PCR amplicon 454 pyrosequencing targeting the bacterial SSU rRNA genes revealed both sites to encompass a microbial community “hotspot” comprised of a high relative abundance of candidate phyla WPS-2 (9.3%) and AD3 (5.1%), as well as uncultured Chloroflexi and Actinobacteria. In addition, the abundance of Cyanobacteria, the primary carbon and nitrogen fixer in many environments, including Antarctica, was extremely low (average 0.35%).

Shotgun metagenomics and differential coverage binning were used to recover 23 draft genomes from Robinson Ridge, including, for the first time, two candidate division WPS-2 and three AD3 draft genomes. While Cyanobacteria abundance was confirmed to be low, metagenomic analysis revealed that 45% of the draft genomes recovered were carrying a novel type IE RuBisCO, indicative of carbon fixation. With no bacterial chlorophyll or rhodopsin identified, a dark carbon fixation process reliant on the oxidation of atmospheric H2 and CO was discovered. This process occurs through the use of the novel IE RuBisCO, as well as specialised high affinity type 1h/5 [NiFe]-hydrogenases and carbon monoxide dehydrogenases, all of which were also widely distributed in the Robinson Ridge metagenome. In contrast to the major carbon acquisition pathway identified, no nitrogen fixation genes were present, yet denitrification capacity was widely detected in the draft genomes. As denitrification would lead to the loss of nitrogen from the ecosystem, the balance of nitrogen in the Windmill Islands region needs further investigation.

An inconsistency in the classification of WPS-2 was identified across the major SSU rRNA gene databases. WPS-2 in Greengenes was comprised of two bacterial phyla, WPS-2 and SHA-109, while WPS-2 in RDP actually consisted of sequences from Planctomycetes. Phylogenetic analysis revealed WPS-2 to be comprised of two sub-clusters that have a close relationship to phylum Chloroflexi. I propose these sub-clusters to be composed of either facultative autotrophs (sub-cluster I) or obligate heterotrophs (sub-cluster II). Genomic inference and fluorescence in situ hybridisation showed that in Antarctic soils, WPS-2 bacteria exist as small diderm cocci, with diameters ranging between 0.6-1.2 µm.

Here, I propose that the microbial community in the extremely carbon-limited soils of Mitchell Peninsula and Robinson Ridge has developed a unique dark carbon fixation process based on the use of a highly dependable fuel source: atmospheric gases. Using trace H2 and CO as energy sources provides a huge advantage for microbes surviving in the extreme polar desert environments of East Antarctica. I believe that in this ecosystem, atmospheric carbon fixation, which is distinct from phototrophy or geothermal chemotrophy, is actually a novel primary production strategy supporting life in this harsh environment. Gas chromatography and isotope labelling experiments are required to confirm this hypothesis, and confirmation would change our understanding of the nutritional limits required to sustain life.

Acknowledgement

First and foremost, I would like to thank my supervisor Dr Belinda Ferrari, for taking me as her PhD student, and for her support and guidance throughout my candidature. Then I would like to thank my parents, for their endless emotional and financial support, as well as my wife, who has taken over most of the family duties and takes good care of our daughter Sunny. I would like to thank everyone in the lab, for their help, support, friendship and laughter, especially Tristrom and Josie: when I started my master's degree, they taught me the very basic microbiology techniques, as well as pyrosequencing, which my PhD is based on. I would like to thank the Australian Antarctic Division, who provided soil samples with such amazing microbial diversity, and Professor Phil Hugenholtz and his team, who did the metagenomic sequencing for free and provided assistance in genome analysis. Special thanks to Dr Lewis Adler from BMSF, whom I have known for five years. He is a great friend, full of ideas, and always drops by even just to say hi. Lastly, I would like to thank the Chinese lunch group. I had great fun and it was a nice break from everyday PhD work. Thanks especially to Tammy, who kindly organised it every day.

List of Tables

Table 2.1 Alpha diversity indices of bacterial and fungal diversity at Mitchell Peninsula and different sampling groups ...... 36

Table 3.1 Measured environmental parameters in Robinson Ridge surface soils ...... 46

Table 3.2 Robinson Ridge metagenome sequence statistics ...... 52

Table 3.3 Summary of near-complete draft genomes recovered from Robinson Ridge metagenome ...... 56

Table 4.1 PCR primers and probe used in this study ...... 76

Table 4.2 Genome statistics of WPS-2 ...... 80

Table 5.1 Taxonomic classifications of sequences designated as WPS-2 or WD272 ...... 94

Table 5.2 Similarity between reference WPS-2 sequences and observed clusters following phylogenetic analysis ...... 96

List of Figures

Figure 1.1 Ice-free areas and research stations present in Antarctica ...... 3

Figure 1.2 Illustrative representation of the possible endolithic and hypolithic habitats of microorganisms ...... 6

Figure 1.3 Representation of nitrogen cycling in soils ...... 9

Figure 1.4 The most recent tree of life described from NGS data ...... 14

Figure 1.5 Workflow and overview for recovering population genomes from shotgun metagenomics data ...... 16

Figure 1.6 Map of Windmill Islands showing the five major peninsulas, the Clark, Bailey, Mitchell and Browning Peninsulas and Robinson Ridge ...... 19

Figure 2.1 Map of Mitchell Peninsula, Windmill Islands, East Antarctica ...... 25

Figure 2.2 Principal coordinate analysis (PCoA) based on Euclidean distance calculated from normalized environmental parameters ...... 28

Figure 2.3 Bacterial diversity at Mitchell Peninsula ...... 30

Figure 2.4 The different distribution of major bacterial classes among the three sample groups ...... 31

Figure 2.5 Fungal diversity at Mitchell Peninsula ...... 34

Figure 2.6 The different distribution of major fungal classes among the three sample groups ...... 35

Figure 2.7 Abundance of bacterial and fungal communities ...... 36

Figure 2.8 Principal coordinate analysis based on distance calculated from (A) bacterial SSU RNA gene OTU abundance matrix or (B) fungal ITS gene abundance matrix ...... 37

Figure 2.9 Network analysis of 80% cutoff based on OTU correlation analysis ...... 40

Figure 3.1 Map of Robinson Ridge and Mitchell Peninsula in Windmill Islands, East Antarctica ...... 47

Figure 3.2 Eukaryotic and prokaryotic taxonomy compositions of Robinson Ridge soils ...... 53

Figure 3.3 Phylogenetic tree demonstrating the phylogenetic relationship between recovered candidate division bacteria WPS-2 and AD3 inferred from multiple marker genes ...... 55

Figure 3.4 Comparison of relative abundance of KO gene categories between Robinson Ridge and other soil ecosystems ...... 59

Figure 3.5 The gene arrangement structures for RuBisCO gene cluster predicted draft genomes ...... 64

Figure 3.6 Phylogenetic structure of type IE RuBisCO large subunit amino acid sequences identified from the Robinson Ridge ...... 65

Figure 3.7 Maximum likelihood phylogenetic tree showing the type 1h taxonomy classification of Robinson Ridge hydrogenase sequences ...... 67

Figure 4.1 The structure of the pET24a-WPS2SSU plasmid ...... 75

Figure 4.2 Overview of WPS-2 functional potential inferred from two draft genomes ...... 80

Figure 4.3 Clone FISH for FISH optimisation using E. coli carrying WPS-2 16S SSU gene plasmid ...... 84

Figure 4.4 FISH following hybridisation with Eub-338i-FAM and WPS2-289-Cy3 ...... 85

Figure 5.1 Map of the Antarctic and High Arctic locations studied ...... 90

Figure 5.2 Phylogenetic trees of SSU RNA gene sequences associated with WPS-2 ...... 94

Figure 5.3 Refined WPS-2 phylogeny using sequences classified as WD272 or WPS-2 ...... 98

Figure 5.4 Two insertions identified in cluster I WPS-2 SSU RNA gene sequences ...... 100

Figure 5.5 Relative abundance of WPS-2 in polar soils ...... 101

Figure 5.6 Significant geophysical and chemical parameters that correlated to WPS-2 relative abundances ...... 101

Figure 5.7 The distribution of dominant WPS-2 OTUs and the environmental parameters that best explained their distributions ...... 103

Figure 5.8 Network analysis of dominant WPS-2 OTUs to environmental parameters and bacterial OTUs ...... 104

1 List of Appendices

A ppendix 2.1 DistLM results showing environmental parameters that significantly 2 Appendix 2.1 DistLM results showing environmental parameters that significantly

<0.05) explained the bacterial and fungal distributions ...... 154 3 (p<0.05) explained the bacterial and fungal distributions ...... 154

A ppendix 3.1 Distribution of Robinson Ridge genes by KEGG categories ·········· .156 4 Appendix 3.1 Distribution of Robinson Ridge genes by KEGG categories ·········· .156

A ppendix 3.2 Hierarchical cluster analysis ofthe Robinson Ridge against17 publicly 5 Appendix 3.2 Hierarchical cluster analysis ofthe Robinson Ridge against17 publicly

available soil metagenomes including those from grassland, desert, Arctic tundra and volcanic soils ····· 157

Appendix 3.3 Key genes identified in the draft genomes involved in carbon fixation, and carbon, nitrogen, sulphur cycling ····· 158

Appendix 3.4 Protein coding sequences of 3-hydroxypropionate carbon fixation pathway within the recovered thaumarchaeota draft genome (bin 19) ····· 151

Appendix 3.5 KEGG map of CBB carbon fixation pathway ····· 164

Appendix 3.6 KEGG map of nitrogen metabolism ····· 165

Appendix 3.7 Taxonomy of methane monooxygenase identified from Robinson Ridge metagenome based on BLAST ····· 166

Appendix 4.1 Sequence origin of predicted ORF within WPS-2 draft genomes bin22 and bin23 ····· 170

Appendix 4.2 Predicted glycoside hydrolases within WPS-2 draft genomes bin22 and bin23 using dbCAN search ····· 171

Appendix 4.3 Comparison of cell envelope biosynthesis related genes between recovered WPS-2 genomes and a range of bacteria and archaea ····· 174

Appendix 5.1 Transformation of environmental parameters to reduce skewness of data ····· 175

Appendix 5.2 Environmental parameters that best explained the distribution of dominant WPS-2 OTUs in Arctic and Eastern Antarctica based on DistLM analysis ····· 176


List of abbreviations

ANOSIM      Analysis of similarity
ARISA       Automated ribosomal intergenic spacer analysis
BLAST       Basic local alignment search tool
CARD-FISH   Catalysed reporter deposition–fluorescence in situ hybridization
CBB cycle   Calvin-Benson-Bassham cycle
Clone-FISH  Fluorescence in situ hybridization of 16S rRNA gene clones
CPR         Candidate phyla radiation
db-RDA      Distance-based redundancy analysis
DGGE        Denaturing gradient gel electrophoresis
DistLM      Distance-based linear modelling
DMB         Dry matter basis
DNA         Deoxyribonucleic acid
E. coli     Escherichia coli
FISH        Fluorescence in situ hybridisation
GTR models  Generalized time-reversible models
H2          Molecular hydrogen
LB          Lysogeny broth
MDM         Microbial dark matter
NAST        Nearest Alignment Space Termination
NCBI        National Center for Biotechnology Information
NGS         Next generation sequencing
NRPS        Nonribosomal peptide synthetases
OTU         Operational taxonomic unit
PCO         Principal coordinate analysis
PCR         Polymerase chain reaction
PFA         Paraformaldehyde
PKS         Polyketide synthase
qPCR        Quantitative PCR
SDS         Sodium dodecyl sulfate
SSU RNA     Small subunit ribosomal RNA
TCA cycle   Tricarboxylic acid cycle
VBNC        Viable but non-culturable state
WPS-2       Candidate phylum WPS-2


Chapter 1 General introduction

Antarctica is the Earth's southernmost continent. With only 0.3-0.4% of the continent ice-free (Fox et al., 1994, Cary et al., 2010), Antarctica has one of the most extreme environments on Earth. Organisms living in terrestrial Antarctica are subjected to very low temperatures, limited carbon, nitrogen and water availability, strong UV radiation and frequent freeze-thaw cycles (Vincent et al., 2012, Convey et al., 2014). Despite this, active bacterial growth has been reported throughout the continent, in both marine and terrestrial environments (Grzesiak et al., 2009, Doyle et al., 2013, Schwartz et al., 2014).


Microbial diversity in terrestrial Antarctica has previously been considered lower than that of temperate regions (Yergeau et al., 2007). In addition, due to the extreme environmental conditions and limited carbon and nitrogen, Antarctic ecosystems support only limited trophic levels consisting of viruses, bacteria, archaea, fungi and lower invertebrates, with a general absence of insect and mammalian herbivores (Heal et al., 1987, Yergeau et al., 2009). However, with the advances in next generation sequencing techniques, a ‘surprisingly’ diverse community of bacteria, archaea and fungi within Antarctic soils has been revealed (Pointing et al., 2009; Kim et al., 2014; Tindall 2004; Teixeira et al., 2010). Furthermore, levels of both richness and phylogenetic diversity similar to those of non-polar biomes have also been reported in Antarctica (Chu et al., 2010).

1.1 Terrestrial Antarctica harbours abundant and diverse microbial communities

The majority of Antarctica’s terrestrial biodiversity is concentrated in the relatively small and fragmented ice-free areas (Terauds et al., 2012). To date, the majority of research stations are present in the Antarctic Peninsula or maritime Antarctica (Figure 1.1). Because the largest ice-free areas in Antarctica are found in South Victoria Land (Ross Sea and McMurdo Dry Valley regions) and the South Shetland Islands (Figure 1.1), microbial ecology studies have also predominantly focused on these regions (Sutherland 2009, Mataloni et al., 2010, Schwartz et al., 2014, Rampelotto et al., 2015).


The Antarctic dry desert mineral soils are the most common soil type, and are highly diverse in pH (varying from acidic through neutral to alkaline) (Azmi et al., 1997, Aislabie et al., 2004, Shravage et al., 2007, Kim et al., 2012, Siciliano et al., 2014) and water content (0.2% to >10.5%), with water content notably decreasing as the distance from the coastline or a lake increases (Cowan et al., 2002, Aislabie et al., 2008, Lee et al., 2012). In general, however, they are low in carbon (<2%), nitrogen (<0.1%) and phosphorus (<0.1%) (Niederberger et al., 2008, Wood et al., 2008, Pan et al., 2013, Siciliano et al., 2014). Antarctic desert soils are dominated by the bacterial phyla Proteobacteria, Acidobacteria, Actinobacteria, Bacteroidetes, Firmicutes and Gemmatimonadetes, which also dominate temperate soils (Bakermans et al., 2014; Cary et al., 2010; Chong et al., 2011; Kim et al., 2012; Pearce et al., 2012; Roesch et al., 2012; Teixeira et al., 2010; Van Horn et al., 2013; Varin et al., 2012). In addition, high abundance (>1%) of candidate phyla such as Saccharibacteria (TM7), WPS-2, AD3, OP1, OP8 and OP11 have been identified at relative abundance above 0.5% (Zeglin et al., 2011, Van Horn et al., 2013, Siciliano et al., 2014).


Figure 1.1 Ice-free areas and research stations present in Antarctica.

Closed circles indicate permanent research stations. Open circles represent facilities open during the summer season only. Shaded areas represent ice-free regions; different colours denote the Antarctic Conservation Biogeographic Regions. The regions that have received extensive biodiversity studies are highlighted: South Victoria Land is red; the South Shetland Islands region is blue. (Adapted using data retrieved from Antarctic Digital Database Map Viewer http://www.add.scar.org/home/add7)


In comparison to desert soils, ornithogenic soils are formed as a result of bird activities, especially penguin rookeries (Speir et al., 1984, Aislabie et al., 2009). Consequently, ornithogenic soils are acidic and high in carbon, nitrogen and phosphorus, with concentrations at least 10- to 1000-fold higher than those of the mineral soils (Aislabie et al., 2008, Aislabie et al., 2009). As a result, ornithogenic soils contain distinctly different bacterial communities from those present in Antarctic desert soils, with an extremely high abundance of Firmicutes or Bacteroidetes reported (up to 83.5% Firmicutes was identified from Cape Hallett) (Aislabie et al., 2008, Teixeira et al., 2010, Kim et al., 2012). Furthermore, the total abundance of bacteria measured as small subunit ribosomal RNA (SSU RNA) gene copy number (10^12 copies of SSU RNA gene/g) (Ma et al., 2013) is several orders of magnitude higher than in Antarctic desert soils (which range between 10^7 and 10^9 SSU RNA gene copies/g of dry soil) (Nichols et al., 1999, Magalhães et al., 2014, Ji et al., 2015, Kudinova et al., 2015).


Geothermal soils are a special type of desert soil present at active volcanoes, such as Mount Erebus, Mount Melbourne (Lyon et al., 1974) and Mount Rittmann in Victoria Land, and Deception Island (Logan et al., 2000). The soil temperature ranges from 50 to 75°C due to geothermal activity. As observed in Antarctic desert soils, the concentration of carbon, nitrogen and phosphorus is low, combined with high concentrations of heavy metals such as copper, zinc, cadmium, lead and mercury (Logan et al., 2000), which impose further environmental stress on the microorganisms present. Thermal-tolerant Actinobacteria and Firmicutes have been identified in geothermal soils of Mt. Melbourne (Bargagli et al., 2004), while Geobacillus, Brevibacillus, Thermus and uncultured sulphate-reducing bacteria have been identified from Deception Island (Muñoz et al., 2011).


Cryptoendolithic (endolithic) microbial communities are described as microorganisms (bacteria, fungi and algae) growing inside translucent rocks (Broady, 1981). In contrast, hypolithic communities exist on the underside and around the margins of translucent rocks (Friedmann, 1982, Cary et al., 2010, Wei et al., 2015) (Figure 1.2). The rocks provide shelter against strong winds and fluctuating temperatures, while allowing sunlight to pass through for photosynthetic bacteria and algae to function (Friedmann, 1982, Cary et al., 2010). Based on the rock niche that the microorganisms occupy, endolithic organisms are further classified into chasmoendoliths, those colonizing rock fissures and cracks; cryptoendoliths, those occupying structural cavities within the rock; and euendoliths (hypoendoliths), those actively penetrating the substratum (Figure 1.2) (Wei et al., 2015, Zucconi et al., 2016). Different microbial community compositions have been reported in the three types of endolithic communities described. However, they are all dominated by Cyanobacteria, with Actinobacteria, Proteobacteria and Chloroflexi also commonly identified (Pointing et al., 2009, Cowan et al., 2011, Makhalanyane et al., 2013). Due to their carbon and nitrogen fixation capacity, Cyanobacteria are believed to be the primary producers within the hypolithic communities while also providing an important source of organic nitrogen to support the microbial community of surrounding soils (Cowan et al., 2011).


Figure 1.2 Illustrative representation of the possible endolithic and hypolithic habitats of microorganisms.

Two general types of lithobiontic habitats are identified in Antarctic terrestrial ecosystems: hypolithic (underneath rocks) and endolithic (inside rocks). The endolithic community is further divided into cryptoendolithic, chasmoendolithic and hypoendolithic (euendolithic) environments. (Modified from Wierzchos et al., 2013).

1.2 Cold adaption strategies of Antarctic bacteria

Bacteria have developed various strategies to tolerate the extreme environment of Antarctica, including the ability to survive frequent freeze-thaw cycles, high salinity, strong UV radiation and desiccation (Roesch et al., 2012, Tahon et al., 2016). Bacteria isolated from Antarctic soil and ice cores are frequently spore-forming and/or capable of forming thick cell walls and extracellular polysaccharides, which increase desiccation and freeze-thaw resistance (Knowles et al., 2008). Pigmentation is also commonly observed in Antarctic bacteria (Volkman et al., 1988, Busse et al., 2003), and experimental evidence has shown that carotenoids (a class of pigments) can significantly increase survival rates following exposure to repeated freeze-thaw cycles and UV radiation (Dieser et al., 2010).

Cold tolerant bacteria are capable of increasing membrane fluidity, thereby tolerating freezing conditions, by converting saturated fatty acids to unsaturated fatty acids within the membrane (An et al., 2013). Alternatively, they can synthesise branched-chain fatty acids, reduce the size and charge of lipid head groups and/or incorporate polar carotenoids into the bacterial membrane (Chattopadhyay et al., 2001, De Maayer et al., 2014).


When temperatures drop below the freezing point of an organism’s cytoplasm, ice crystal formation within a cell leads to cellular damage and osmotic imbalances (De Maayer et al., 2014). Under freezing, desiccation, carbon starvation or high salt stress, bacteria are capable of synthesising compatible solutes, a group of low molecular weight, highly soluble organic molecules that can be accumulated in large quantities within a cell (Klähn et al., 2011). These compounds include sucrose, trehalose, glucosylglycerol, polyhydroxybutyrate and glycine betaine, all of which lower the freezing point of the cytoplasm and increase osmotic pressure to protect cells against desiccation, while also serving as carbon storage compounds (Erdmann 1983, Welsh et al., 1999, Mothes et al., 2008).


Many Antarctic soil bacteria are in a viable but non-culturable (VBNC) state (Chattopadhyay 2000, Oliver 2005), due to limited nutrients, limited oxygen or low temperatures. Microscopic examination has shown that up to 90% of the bacteria extracted from Antarctic soils are small in size (diameter <0.2 μm, dwarf bacteria), which is an indication of VBNC (Kudinova et al., 2015). During VBNC, bacteria are dormant and not actively growing. Instead, they have reduced metabolic activity and macromolecular synthesis (Smith et al., 1994, Su et al., 2013). However, unlike in dead cells, during dormancy cellular membrane integrity is maintained and the cellular ATP level remains high (Oliver 2005, Greening et al., 2015b). In addition, the energy required to maintain dormancy and sense environmental change is proposed to be acquired from atmospheric hydrogen gas (H2) oxidation through membrane-bound high affinity [NiFe]-hydrogenases (Morita, 1999, Greening et al., 2014a, Greening et al., 2015b).

1.3 Functional capacities of Antarctic soil bacteria

Autotrophic carbon acquisition

In general, Antarctic desert soils are carbon-limited and autotrophic carbon fixation is an important source of carbon influx for organisms inhabiting these environments (Yergeau et al., 2007). Autotrophic bacteria with capacity for the Calvin-Benson-Bassham (CBB) carbon fixation pathway have been identified in soils across Antarctica, such as the Sør Rondane Mountains (Tahon et al., 2016), McMurdo Dry Valleys (Chan et al., 2013) and Mt. Erebus (Tebo et al., 2015), and are reported to be negatively correlated with soil organic carbon. Generally, Cyanobacteria are believed to predominantly carry out phototrophic carbon fixation (Chan et al., 2013, Tebo et al., 2015) and are frequently identified from microbial mats (Vincent et al., 1993, Taton et al., 2003, Varin et al., 2012), hypolithic and endolithic communities (Wood et al., 2008, Cowan et al., 2011, Lee et al., 2012), while Actinobacteria and Proteobacteria carry out chemolithoautotrophic or dark carbon fixation (Varin et al., 2012, Chan et al., 2013) and are predominantly identified in soils. Additionally, enzymes associated with alternative carbon fixation pathways, such as propionyl-CoA/acetyl-CoA carboxylase (in the 3-hydroxypropionate/malyl-CoA cycle), carbon monoxide dehydrogenase and ATP citrate lyase (reductive tricarboxylic acid cycle), have also been identified in certain Antarctic soils (Chan et al., 2013).

Nitrogen cycling

Nitrogen is extremely limited in Antarctic dry desert soils (<0.1%) (Pan et al., 2013) and has been reported to be predominantly replenished by volcanic activities (Oppenheimer et al., 2005), snowmelt (Burkins et al., 2000) and glacier melts (Howard-Williams et al., 1989). Though nitrogen-containing compounds are limited, the complete nitrogen cycle (nitrogen fixation, denitrification, nitrification and ammonia mineralization) has been predicted for Antarctic desert soils, but the presence of certain pathways varies among sites. For example, nitrogen fixation capacity based on the presence of the nifH gene (Figure 1.3) was not detected in open soil from the McMurdo Dry Valleys (Cowan et al., 2011), but was detected, albeit at limited levels, in hypolithic communities and soils covered with vegetation (Yergeau et al., 2007). In contrast, denitrification capacity has been widely identified in Antarctic soils, and biological emission of N2O, most likely as a result of denitrification, has been reported in situ (Gregorich et al., 2006). Genes that are indicative of denitrification (nirS, nirK, norB, nosZ) (Figure 1.3) have also been detected in Antarctic soils, but are less abundant in high altitude or low latitude regions such as Fossil Bluff or Coal Nunatak (Yergeau et al., 2007, Jung et al., 2011, Chan et al., 2013).


Figure 1.3 Representation of nitrogen cycling in soils.

Green pathways denote denitrification, blue denotes nitrogen fixation, orange denotes nitrification and purple denotes the anammox pathway. All pathways are present at variable levels across Antarctic soils (Reproduced from Crane, 2016).


Antibiotic resistance and production capacities

Antibiotic resistance genes have been identified in both pristine and human-impacted Antarctic soils (Miller et al., 2009, Durso et al., 2012, Segawa et al., 2013). Though identified at lower relative abundances compared with temperate regions (17.6% vs 21.5%), beta-lactamase, multi-drug resistance efflux pump and fluoroquinolone resistance genes are widely distributed across many bacterial and archaeal phyla (Acidobacteria, Actinobacteria, Bacteroidetes, Chlorobi, Crenarchaeota, Cyanobacteria, Euryarchaeota, Firmicutes, Planctomycetes, Proteobacteria). In contrast, genes for tetracycline and vancomycin resistance were distributed in fewer bacterial lineages such as Actinobacteria, Deinococcus-Thermus and Firmicutes (Durso et al., 2012). In addition to antibiotic resistance capacity, type-I polyketide synthase (PKS) and non-ribosomal peptide synthetase (NRPS) genes, which are involved in bioactive secondary metabolite synthesis, have been identified from Antarctic coastal sediments (Zhao et al., 2008). Moreover, actinobacterial strains isolated from Antarctic soils (Arthrobacter, Rhodococcus, Streptomyces and Micromonospora) have shown the capacity to produce bioactive compounds with both antibacterial and antifungal effects (Gesheva 2010, Benaud 2014).

1.4 The role of biotic and abiotic factors in shaping Antarctic soil microbial communities

It has long been recognized that the differences observed among microbial community structures are influenced by both environmental heterogeneity and spatial distance (Espinosa-Garcia et al., 1990, Martiny et al., 2006, Ramette et al., 2007). Due to the simplicity of the Antarctic trophic systems, Antarctic soils provide an excellent opportunity to investigate the correlation between microbial community assembly and external drivers (Yergeau et al., 2009). Globally, soil microbial community diversity (measured by UniFrac) and richness are primarily driven by pH at the continental scale, with the highest diversity observed in soils with a neutral pH (Fierer et al., 2006, Lauber et al., 2009, Rousk et al., 2010). Across the poles, pH is also strongly associated with community composition; however, species richness is primarily influenced by soil fertility (defined as organic matter, nitrogen and chloride content) (Siciliano et al., 2014).

Antarctic microbial diversity is also heavily influenced by landscape connectivity, the presence of vegetation/moss coverage and the presence of birds or mammals (Yergeau et al., 2007, Ferrari et al., 2015, Wang et al., 2015). The geomorphology of a landscape affects the microbial communities and their apparent connectivity (Ferrari et al., 2015, Rampelotto et al., 2015), with a higher degree of microbial connectivity observed in soils from continuous landscapes compared with soils with reduced connectivity due to the presence of cryoturbation such as frost boil development (Ferrari et al., 2015). Vegetative coverage strongly affects community structure, with distinct communities observed between vegetated and fell-field or barren soils (Yergeau et al., 2007). Penguins and seals are commonly observed in coastal regions and thus also influence soil microbial diversity (Wang et al., 2015), with the presence of penguin colonies leading to the development of ornithogenic soils (Aislabie et al., 2009). Due to the extremely slow rate of soil processes in Antarctica, soils are particularly susceptible to human-induced damage, and human activity is considered the most important factor affecting microbial community structures (Delille 2000, Chong et al., 2010, Siciliano et al., 2014).

Fuel spillage has been shown to severely reduce soil microbial diversity (Aislabie et al., 2001, Ruberto et al., 2003, Chong et al., 2009, van Dorst et al., 2014, Wang et al., 2015). Bioremediation activities have been carried out across Antarctica. However, bioremediation of hydrocarbons enriches for hydrocarbon-degrading bacteria, therefore leading to further shifts in microbial diversity as the bioremediation process progresses (Powell et al., 2006). Other human activities also decrease soil microbial biodiversity (Chong et al., 2009, Pan et al., 2013). For example, transportation of vehicles into Antarctica has led to the introduction of alien plant and bacterial species (Hughes et al., 2010), while human faecal bacteria from sewage outlets have also been introduced into the sensitive Antarctic environment (Sjöling et al., 2000).

1.5 Next generation sequencing technologies and expanding our understanding of microbial diversity and function

Prior to high throughput next generation sequencing (NGS) technologies, bacterial diversity in Antarctica was considered much lower than that of temperate regions (Wall et al., 1999). Advances in sequencing techniques and analytical methods provide unmatched sequencing depth at reduced cost compared with traditional Sanger sequencing methods (van Dorst et al., 2014). This has not only led to a greater understanding of microbial taxonomic diversity and functional capacities, but has also enabled us to infer the interaction within microbial communities and nutrient at single-cell level (Sangwan et al., 2016).

In contrast to the traditional taxonomic classification system based on phenotypic characteristics (such as substrate specificity, cell morphology etc.) (Vandamme et al., 1996), the short 16S rRNA gene fragments generated using NGS are grouped into operational taxonomic units (OTUs) (Nguyen et al., 2016) based on sequence identity cutoffs, typically 97%, to represent species level classification. NGS overcomes the “great plate count anomaly” (Sizova et al., 2012, Sizova et al., 2012) and has revealed the wealth of the uncultured majority, which is now viewed as biology's “microbial dark matter” (MDM) (Hugenholtz et al., 1998, Marcy et al., 2007, Vartoukian et al., 2010). Most recently, NGS has led to a new view of the tree of life, comprising 92 named bacterial phyla, in which more than half of the reported major lineages (55) lack cultured representatives (red dots in Figure 1.4) (Hug et al., 2016). Furthermore, bacterial phyla that lack cultured representatives are defined based on 16S rRNA gene sequences only and are termed “candidate divisions (phyla)” (Hugenholtz et al., 2001).
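The grouping of 16S rRNA gene fragments into OTUs at a 97% identity cutoff can be illustrated with a minimal sketch. It assumes reads that are already aligned to the same length and uses a simple greedy centroid rule; the function names and toy reads are hypothetical, and dedicated clustering tools use more sophisticated alignment and ordering heuristics than shown here.

```python
from typing import List


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length, pre-aligned reads."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def cluster_otus(reads: List[str], cutoff: float = 0.97) -> List[List[str]]:
    """Greedy centroid clustering (an illustrative assumption, not the thesis method).

    Each read joins the first OTU whose centroid (its first member) it
    matches at >= cutoff identity; otherwise it seeds a new OTU.
    """
    otus: List[List[str]] = []
    for read in reads:
        for otu in otus:
            if identity(read, otu[0]) >= cutoff:
                otu.append(read)
                break
        else:
            otus.append([read])
    return otus


# Toy 100-bp reads (hypothetical data, not from the thesis).
ref = "ACGT" * 25
near = "T" + ref[1:]       # one mismatch: 99% identity to ref
far = "TTTTT" + ref[5:]    # four mismatches: 96% identity to ref
otus = cluster_otus([ref, near, far])
print(len(otus), "OTUs")
```

With these toy reads, the 99% identical read joins the OTU seeded by the reference, while the 96% identical read falls below the cutoff and seeds its own OTU.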

The direct sequencing of sheared total DNA extracted from a sample (shotgun sequencing) can provide additional genomic functional information, allowing us to infer strategies microbes use to adapt to their external environments. Furthermore, complete and draft genomes of bacterial taxa can now be recovered from short shotgun sequencing reads, allowing us to also infer the “taxon-specific” potential of an organism (Albertsen et al., 2013, Rinke et al., 2013, Hug et al., 2016). To recover individual bacterial genomes from shotgun sequencing reads, the short sequencing reads are first assembled into large contigs, then grouped into taxonomic bins (binning) (Albertsen et al., 2013). The binning can be based on nucleotide composition, abundance differences across different samples, or both (Figure 1.5) (Sangwan et al., 2016). In contrast, single cell genomic sequencing relies on the isolation of a single cell from the community using automatic (such as flow cytometry (Raghunathan et al., 2005)) or manual separation (such as micromanipulation (Marcy et al., 2007)). The DNA from each cell is extracted and the whole genome is amplified using multiple displacement amplification. The amplified genome is then sequenced using NGS and assembled as described for shotgun genomics (Lasken 2007).
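The idea of binning contigs by nucleotide composition and by abundance differences across samples can be sketched as follows. This is a simplified illustration under stated assumptions: GC content stands in for nucleotide composition, coverage in two samples stands in for differential abundance, and a greedy distance threshold (`max_dist`, a hypothetical parameter) stands in for the clustering step. It is not the procedure of Albertsen et al. (2013), and the contigs are invented toy data.

```python
from math import dist, log10
from typing import Dict, List, Tuple


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a contig."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def features(seq: str, cov_a: float, cov_b: float) -> Tuple[float, float, float]:
    """Composition plus log-scaled coverage in two samples (A and B)."""
    return (gc_content(seq), log10(cov_a), log10(cov_b))


def bin_contigs(
    contigs: Dict[str, Tuple[str, float, float]], max_dist: float = 0.2
) -> List[List[str]]:
    """Greedy binning: a contig joins the first bin whose seed contig lies
    within max_dist (Euclidean) in feature space, else it seeds a new bin."""
    bins: List[Tuple[Tuple[float, float, float], List[str]]] = []
    for name, (seq, cov_a, cov_b) in contigs.items():
        f = features(seq, cov_a, cov_b)
        for seed, members in bins:
            if dist(f, seed) <= max_dist:
                members.append(name)
                break
        else:
            bins.append((f, [name]))
    return [members for _, members in bins]


# Hypothetical contigs: (sequence, coverage in sample A, coverage in sample B).
contigs = {
    "c1": ("GCGCGCATGC", 50.0, 5.0),
    "c2": ("GCGGCCATGC", 55.0, 6.0),
    "c3": ("ATATATGCAT", 4.0, 40.0),
}
bins = bin_contigs(contigs)
print(bins)
```

Contigs originating from the same genome are expected to share both similar composition and similar coverage in every sample, which is why `c1` and `c2` group together here while `c3`, with low GC content and the opposite coverage pattern, forms a separate bin.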

Figure 1.4 The most recent tree of life described from NGS data.

The tree is comprised of 92 named bacterial phyla, 26 archaeal phyla and all five of the eukaryotic supergroups. The names of lineages with cultured representatives are italicized, and lineages lacking an isolated representative are highlighted with non-italicized names and red dots (reproduced from Hug et al., 2016).

To date, bacterial genomes have been recovered from a wide range of environments, including acid mine drainage (Tyson et al., 2004), the human gut (Sharon et al., 2013), the ocean (Iverson et al., 2012), an activated sludge bioreactor (Albertsen et al., 2013) and an aquifer (Brown et al., 2015), while genome reconstruction from soils is more problematic due to the high diversity and heterogeneity of the soil microbial community (Howe et al., 2014). The most notable was the recovery of a complete circular genome of candidate phylum TM7 from a bioreactor, after which candidate phylum TM7 was renamed Saccharibacteria in 2013 (Albertsen et al., 2013). In the same year, complete circular genomes for candidate phyla SR1 and WWE3 were also recovered from metagenomic data (Albertsen et al., 2013, Kantor et al., 2013). Since 2013, numerous studies using single-cell genomics or metagenomics combined with binning approaches have characterised many uncultured taxa. Most recently, this approach has led to the discovery of the candidate phyla radiation (CPR) group, which is a group of small genome (often <1 Mb) bacterial candidate phyla that share an evolutionary history (Brown et al., 2015). The members of the CPR are incapable of de novo biosynthesis of nucleotides, lipids and most amino acids (Kantor et al., 2013). As a result, CPR members are most likely symbionts that depend on the bacteria within the surrounding community for cellular metabolism (Solden et al., 2016). It has also been proposed that, given their limited metabolic capacity and strong dependency on surrounding host bacteria, traditional cultivation methods using high nutrient agar should be revised, and that co-culture or cultivation in the natural environment could provide an alternative way to culture and thus investigate the physiological features of the CPR lineages (Solden et al., 2016).

Figure 1.5 Workflow and overview for recovering genome bins from shotgun metagenomics data (reproduced from Sangwan et al., 2016).

1.6 The Windmill Islands, an under-explored ice free region

The Windmill Islands region in East Antarctica (centered 66°15'S, 110°33'E) is located in the midst of Wilkes Land within the Australian Antarctic Territory (Figure 1.6). The region is comprised of five major areas, the Clark, Bailey, Mitchell and Browning Peninsulas and Robinson Ridge, as well as a number of islands forming ice-free oasis areas surrounding the region (Goodwin 1993). Casey station, a permanent research station, is located on Bailey Peninsula and has been managed by the Australian Antarctic Division since 1988. The mean temperatures at Casey station for the warmest and the coldest months are 0.3°C and -14.9°C, respectively, while the extremes of temperature range from 9.2°C to -41°C (Azmi et al., 1998, Beyer et al., 2000). Annual precipitation is 175 mm (rainfall equivalent), which falls primarily as snow, but rain may occasionally fall during the summer months (Beyer et al., 2000).

Fungi and bryophytes have been studied extensively using cultivation and field observation methods across all of the major peninsulas and islands, and over 90 terrestrial and aquatic algae have been identified (Ling et al., 1990, Melick et al., 1994, Azmi et al., 1997, Azmi et al., 1998). However, limited microbial community investigations have been conducted. Soils around Casey station have been most frequently investigated, but these studies have primarily focused on the effects of petroleum contamination on soil microbial diversity (Powell et al., 2005, Harvey et al., 2012, Richardson et al., 2015). Currently, there is only one study reporting on the bacterial community diversity in pristine soils at Casey station, Mitchell Peninsula, Browning Peninsula and Antarctic Specially Protected Areas (135 and 136) using denaturing gradient gel electrophoresis (DGGE) (Chong et al., 2009), while the microbial biomass at Browning Peninsula has been studied using a range of methods including qPCR, fatty acid profiling analysis (Stewart et al., 2012), microscopy counting, ATP level estimation and plate count methods (Roser et al., 1993).

In a large investigation of soil microbial diversity across both polar regions, bacterial and fungal richness and diversity were reported to be shaped by fertility and pH (Siciliano et al., 2014), while microbial community assembly and connectivity at the regional scale were also heavily influenced by geological connectivity (Ferrari et al., 2015). Though the fine-scale community taxonomic composition of the Windmill Islands is largely unknown, we recently reported the phylum-level taxonomic composition at Mitchell Peninsula, Robinson Ridge, Browning Peninsula, Herring Island and Casey station (Siciliano et al., 2014, Ferrari et al., 2015). In these studies, a high relative abundance (>2%) of candidate phyla WPS-2 and AD3 was detected within both Robinson Ridge and Mitchell Peninsula.

Mitchell Peninsula (66°20′S 110°32′E) is located approximately 5 km from Casey station, while Robinson Ridge (66°22′S 110°36′E) is approximately 5 km further south-east from Mitchell Peninsula (Figure 1.6). The rocky and sandy soils of both sites share similar physicochemical properties: the soils are acidic to neutral (pH 4.8-7.05 for Mitchell Peninsula and 5.0-7.1 for Robinson Ridge, respectively) and low in carbon (<0.5%), nitrogen (0.05-0.2% and 0.03-0.09%) and phosphorus (0.05-0.1% and 0.17-0.37) (Melick et al., 1994, Azmi et al., 1998, Siciliano et al., 2014).

Several studies on the diversity of fungi and mosses have been conducted across both sites (Ling et al., 1990, Azmi et al., 1998, McRae et al., 1999a, McRae et al., 1999b), with the lichens Umbilicaria decussata, Pseudephebe minuscula, Usnea sphacelata and Buellia frigida found to dominate (Chong et al., 2009). To my knowledge, only one study has investigated the bacterial community diversity and distribution at the local scale of Mitchell Peninsula (Chong et al., 2009), while the bacterial community diversity at Robinson Ridge remains to be explored.

Figure 1.6 Map of Windmill Islands showing the five major areas, the Clark, Bailey, Mitchell and Browning Peninsulas and Robinson Ridge.

The map inset indicates the location of Casey station and highlights the claimed Australian Antarctic Territory. Modified from the map for Windmill Islands with Antarctica inset, accessed from https://data.aad.gov.au/aadc/mapcat/search_mapcat_results.cfm

1.7 Aims

The microbial diversity in Antarctic soils from the Windmill Islands region is largely unexplored. However, NGS at a regional scale has revealed candidate phyla WPS-2 and AD3 to be abundant at both Robinson Ridge and Mitchell Peninsula. While regional-scale investigation has shown soil fertility and pH to be important drivers of soil communities within the Windmill Islands, very little is known about the microbial diversity at local scales.

The aim of my PhD was to characterise the microbial diversity and functional capacity of the under-explored East Antarctic sites (Mitchell Peninsula and Robinson Ridge) with a focus on microbial dark matter, using up-to-date NGS technologies of amplicon and shotgun sequencing. Specifically, I will:

1) Use multiplex 454 pyrosequencing to investigate the taxonomic assemblage of bacteria and fungi within Mitchell Peninsula, and to explore correlations between abiotic and biotic soil parameters and the resulting community structure across this site.

2) Use shotgun sequencing and differential coverage binning to reconstruct genomes of the uncultured bacteria and archaea present in Robinson Ridge soils, with the aim of identifying the functional potential of unknown taxa/representatives of candidate lineages in this underexplored ecosystem.

3) Investigate the functional capacity of candidate division bacteria WPS-2, resolving the phylogeny and key environmental parameters driving the WPS-2 community distribution across the Windmill Islands.

Chapter 2 Microbial diversity at Mitchell Peninsula, East Antarctica: a potential microbial dark matter “hotspot”

The soil collection and physicochemical analysis were carried out by the AAD, DNA extraction was performed by Josie van Dorst, and the processing of 454 pyrosequencing data, data analysis, bacterial 16S and fungal 18S rRNA gene qPCR, statistical analysis and network analysis were performed by the candidate.

2.1 INTRODUCTION

Mitchell Peninsula (66°31′S, 110°59′E) is an under-explored polar desert in East Antarctica. It is located approximately 5 km from Casey station and is not in the vicinity of bird or seal colonies (Chong et al., 2009). Limited knowledge is available on the microbial diversity of this pristine environment. The bacterial diversity of Mitchell Peninsula has only been investigated using DGGE, in which only six bacterial ecotypes belonging to Bacteroidetes and Proteobacteria were identified (Chong et al., 2009). In addition, fungal community diversity at this site has been investigated by field observation and traditional cultivation only, with four fungal isolates belonging to the phylum Ascomycota identified (Azmi and Seppelt 1998; McRae et al., 1999b).

More recently, NGS revealed a high abundance of candidate phyla WPS-2 and AD3 within soils from Mitchell Peninsula (Siciliano et al., 2015). However, a comprehensive investigation into the bacterial and fungal diversity has not been carried out. It has also been demonstrated that soil fertility and pH are the major drivers of polar soil microbial richness and diversity (Siciliano et al., 2015), but the specific environmental drivers at Mitchell Peninsula are yet to be identified. The well-connected terrain slope of Mitchell Peninsula has also been shown to contain a complex network of microbial associations, but the dynamics between bacterial and fungal communities require further investigation (Ferrari et al., 2015).

The aim of this chapter was to investigate the abundance and the taxonomic assemblage of bacteria and fungi within Mitchell Peninsula soils, and to identify the microphysical and environmental factors that shape the microbial community structure at the local scale.

2.2 MATERIAL AND METHODS

2.2.1 Soil sampling

Surface soil samples (0-10 cm depth) were collected from Mitchell Peninsula (66º31´S, 110º59´E) on 8th December 2006 and stored at -20 °C until analysed in 2011. A total of 93 samples were collected from three 300 m long parallel transects; the samples along each transect had a variable lag distance, ranging from 0.1 m to 50 m (Figure 2.1) (van Dorst et al., 2014). The soil samples were divided into three groups based on their location on the transects, which reflected a single contiguous soil catena: a bottom site (elevation below 20 m, slope 7-10 degrees), a middle site (elevation 30-40 m, slope 11-13 degrees) and a top site (elevation 51-54 m, slope 2-7 degrees).

2.2.2 Physicochemical analyses

Comprehensive geographical and chemical characteristics of the soils were measured as described in van Dorst et al. (2014) and Siciliano et al. (2014). In brief, the geographical parameters measured included longitude, latitude, slope, elevation and aspect. Soil chemical parameters included water extractable ions (Cl⁻, NO₂⁻, Br⁻, NO₃⁻, PO₄³⁻, SO₄²⁻ and NH₄⁺), KCl extractable NH₄⁺, bicarbonate extractable PO₄³⁻, elements analysed by X-ray fluorescence (SiO₂, TiO₂, Al₂O₃, Fe₂O₃, MnO, MgO, CaO, Na₂O, K₂O, P₂O₅, SO₃, Cl), exchangeable ions (P⁵⁺, K⁺, Ca²⁺, Mg²⁺, Zn²⁺, B³⁺, S²⁻, Cu²⁺, Fe³⁺, Mn²⁺, Na⁺, Al³⁺) relative to the cation exchange capacity, as well as water content, pH, total carbon, total Kjeldahl-digested N, total Kjeldahl-digested P and cation exchange capacity. Soil physical parameters, including conductivity, grain size and the percentages of sand, gravel and mud (including silt and clay), were also measured.

2.2.3 DNA extraction

DNA was extracted from a portion of each soil sample (0.25-0.3 g) using the FastDNA SPIN kit for soil (MP Biomedicals, NSW, Australia) following the manufacturer’s instructions. All soil samples were extracted in triplicate and the extracts were quantified with PicoGreen (Life Technologies, Australia) on a fluorescence plate reader (SpectraMax M3 Multi-Mode Microplate Reader, Molecular Devices, USA). Sample DNA consistency between technical replicates was evaluated using automated ribosomal intergenic spacer analysis (ARISA) as described previously (van Dorst et al., 2014) and a representative DNA extract from each soil was selected for pyrosequencing.

2.2.4 Multiplex 454 pyrosequencing and data processing

Multiplex tagged pyrosequencing was performed by the Research and Testing Laboratory (Lubbock, Texas, USA) using the 454 FLX titanium platform (Roche, Branford, CT, USA). Bacterial and fungal assemblages were taxonomically characterised by sequencing the bacterial SSU rRNA genes amplified using primers 27F (5’-GAGTTTGATCNTGGCTCA-3’) and 519R (5’-GTNTTACNGCGGCKGCTG-3’) (Quere et al., 2005) and the fungal ITS region amplified using primers ITS1F (5’-TCCGTAGGTGAACCTGCGG-3’) (Gardes and Bruns, 1993) and ITS4R (5’-TCCTCCGCTTATTGATATGC-3’) (White et al., 1990), respectively.

Sequence data were analysed using the MOTHUR pipeline (Schloss et al., 2009). Raw data were provided as standard flowgram format files and were extracted and error-checked via the PyroNoise algorithm (Quince et al., 2011). Sequence data were further quality screened by removing short reads (<150 bp) and reads containing long homopolymers (>8 repeats), and by truncating long SSU rRNA gene reads (>450 bp).
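
The quality screening thresholds described above can be illustrated with a minimal Python sketch. It is not the MOTHUR implementation: the function names, the in-memory read format and the order of the checks (length, then homopolymers, then truncation) are assumptions made purely for illustration.

```python
import re

MIN_LEN = 150        # reads shorter than this are discarded
MAX_HOMOPOLYMER = 8  # reads with a longer single-base run are discarded
MAX_LEN = 450        # longer reads are truncated to this length


def longest_homopolymer(seq):
    """Return the length of the longest run of one repeated base in seq."""
    return max((len(m.group(0)) for m in re.finditer(r"(.)\1*", seq)), default=0)


def screen_reads(reads):
    """Yield (read_id, sequence) pairs that pass the screening thresholds.

    `reads` is any iterable of (read_id, sequence) tuples (hypothetical format).
    """
    for read_id, seq in reads:
        seq = seq.upper()
        if len(seq) < MIN_LEN:
            continue  # too short
        if longest_homopolymer(seq) > MAX_HOMOPOLYMER:
            continue  # homopolymer run too long
        yield read_id, seq[:MAX_LEN]  # truncate long reads


example_reads = [
    ("short", "ACGT" * 10),                        # 40 bp, removed
    ("homo", "ACGT" * 40 + "A" * 9 + "ACGT" * 5),  # run of 10 A, removed
    ("long", "ACGT" * 150),                        # 600 bp, truncated
    ("ok", "ACGT" * 50),                           # 200 bp, kept unchanged
]
kept = dict(screen_reads(example_reads))
```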

Bacterial sequences were aligned to the curated SILVA secondary structure alignment (Pruesse et al., 2007) and then clustered into operational taxonomic units (OTUs) based on 96% sequence identity (Kim et al., 2011). The taxonomic position of identified bacterial OTUs was assigned using the Greengenes database (May 2013 version) (McDonald et al., 2012; Werner et al., 2012) that was trimmed to the same region as the amplicons (V1-V3). The sequence data were checked for chimeras using the UCHIME algorithm (Edgar et al., 2011) and then preclustered at 1% to account for the error rate of the 454 titanium instrument. An OTU abundance-by-sample matrix was generated from the bacterial dataset and singletons were then removed from the matrix.

Datasets were initially subsampled to equal depth. However, subsampling had no significant impact on the alpha diversity indices when the sub-sampled and non-subsampled datasets were compared (t-test, p>0.05), and previous reports have suggested that subsampling can cause suboptimal use of the information contained in the data set (de Cárcer et al., 2011; McMurdie and Holmes, 2014); the dataset was therefore left intact.
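
To illustrate the comparison described above, the following Python sketch subsamples one sample to a fixed depth and computes the alpha diversity indices named in this chapter on an OTU count table. It is a sketch under stated assumptions rather than the MOTHUR implementation: the function names, the seeded random draw without replacement and the bias-corrected form of Chao1 are choices made for illustration.

```python
import math
import random
from collections import Counter


def subsample(counts, depth, seed=0):
    """Draw `depth` reads without replacement from an OTU -> count dict."""
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    drawn = random.Random(seed).sample(pool, depth)
    return dict(Counter(drawn))


def observed_otus(counts):
    """Number of OTUs with at least one read (species observed)."""
    return sum(1 for n in counts.values() if n > 0)


def shannon(counts):
    """Shannon index using the natural logarithm."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values() if n > 0)


def chao1(counts):
    """Bias-corrected Chao1 richness estimate (assumed variant)."""
    f1 = sum(1 for n in counts.values() if n == 1)  # singletons
    f2 = sum(1 for n in counts.values() if n == 2)  # doubletons
    return observed_otus(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))


def goods_coverage(counts):
    """Good's coverage: 1 - singletons / total reads."""
    f1 = sum(1 for n in counts.values() if n == 1)
    return 1 - f1 / sum(counts.values())


# Hypothetical sample, for illustration only.
sample = {"a": 50, "b": 30, "c": 15, "d": 2, "e": 2, "f": 1}
rarefied = subsample(sample, depth=40, seed=1)
```

Computing each index on `sample` and on `rarefied` for every soil gives the paired values that a t-test would then compare.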

For fungal amplicons, the ITS1 region was first extracted from each amplicon using the software package developed by Nilsson (Bengtsson-Palme et al., 2013). The USEARCH software package (Edgar, 2010) was then used to cluster fungal ITS1 sequences at 97% sequence identity to loosely define OTUs. Representative sequences were picked by MAFFT v6 (Katoh and Toh, 2008) and compared against the UNITE fungal ITS database (Koljalg et al., 2005) by BLASTn (Altschul et al., 1990), and the top 15 matches were retrieved. To correct clustering errors, OTUs that shared over 50% identical top matches were grouped together manually. The resulting OTU abundance matrix was generated from the fungal community data with the QIIME package (Caporaso et al., 2010).

Figure 2.1 Map of Mitchell Peninsula, Windmill Islands, East Antarctica. Transect and sample locations at Mitchell Peninsula. The sampling sites, sample groups and elevations are marked and the relative position of the area on the continent is shown in the inset. The photo on the right was taken at the top sampling location and shows the soil texture at Mitchell Peninsula and the distance of the sampling site to the ocean.

2.2.5 Quantitative PCR (qPCR) targeting bacterial SSU rRNA and fungal SSU rRNA genes

For qPCR, representative DNA extracts from each of the three elevation groups were selected (i.e., 0, 2, 100, 102, 200 and 202 m from each transect origin), which represented the bottom, middle and top sites. The bacterial and fungal loads were determined from the copy numbers of the bacterial and fungal SSU rRNA genes using the published primer pairs Eub338f/Eub518r (Fierer et al., 2005) and FR1/FF390 (Christen et al., 2011), respectively. Each DNA extract was analysed in triplicate using the QuantiFast SYBR Green PCR kit (QIAGEN, Australia) on a Bio-Rad CFX96 Real-Time system (Bio-Rad, Australia). A five-point standard curve (in triplicate) was prepared for both the bacterial and fungal SSU rRNA genes using the corresponding PCR product at different DNA concentrations. Quantification data were collected only when no amplification was observed in the “no-template control”, the melt curve showed a single peak consistent with specific amplification, and the reaction efficiency was 100 ± 10%. Data analysis was carried out using CFX Manager software (Bio-Rad).
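
The reaction efficiency criterion can be made concrete with a short Python sketch. It assumes the conventional definition in which the standard curve is a least-squares line of quantification cycle (Cq) against log10 copy number and efficiency is 10^(-1/slope) - 1; this is not a description of the internals of CFX Manager, and the function names and the example standards are hypothetical.

```python
import math


def fit_standard_curve(log10_copies, cq_values):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_copies, cq_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def efficiency_percent(slope):
    """Amplification efficiency in percent; 100% means perfect doubling."""
    return (10 ** (-1 / slope) - 1) * 100


def passes_efficiency(slope, target=100.0, tolerance=10.0):
    """True when the efficiency lies within target +/- tolerance percent."""
    return abs(efficiency_percent(slope) - target) <= tolerance


def copies_from_cq(cq, slope, intercept):
    """Interpolate the copy number of an unknown from its Cq value."""
    return 10 ** ((cq - intercept) / slope)


# Hypothetical five-point standard curve with perfect doubling per cycle.
ideal_slope = -1 / math.log10(2)
standards = [3, 4, 5, 6, 7]  # log10 copies per reaction
cqs = [40 + ideal_slope * x for x in standards]
slope, intercept = fit_standard_curve(standards, cqs)
```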

2.2.6 Statistical analysis

Alpha diversity indices for both bacteria and fungi, including Good’s Coverage, Species Observed (SOB), Chao1 and the Shannon index, were calculated for each sample group using MOTHUR. Multivariate analysis was conducted using PRIMER 6 with the PERMANOVA+ package (Clarke, 1993). The soil parameters were log or square root transformed to reduce skewness and then normalised, $(x_i - \bar{x})/\mathrm{SD}$ (van Dorst et al., 2014). OTU relative abundance matrices of bacteria and fungi at various taxonomic levels were square root transformed or converted into presence/absence matrices to confirm whether the differentiation between samples was due to variations in relative abundance or in taxonomic structure. The transformed matrices were used to calculate resemblance matrices with Bray-Curtis coefficients. Principal coordinate analyses (PCoA) were performed using the PERMANOVA+ package of PRIMER 6 to visualise the ordering of samples based on either a) normalised soil parameters, b) bacterial or c) fungal community similarity at various taxonomic levels. The environmental parameters and bacterial or fungal taxa that best explained the sample distribution (Pearson’s correlation >0.6 for bacteria and fungi; >0.8 for normalised soil parameters) were overlaid on the PCO plots. The potential to distinguish the three sample groups (bottom, middle, top) was evaluated with ANOSIM (analysis of similarity) using 999 permutations. To find the environmental and geographical parameters that best explained the variability in the biodiversity data, a non-parametric multiple regression analysis was performed using the DISTLM (distance-based linear modelling) function with 999 permutations; a p-value below 0.05 was considered significant.
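
The data transformations that precede the ordinations can be sketched in a few lines of Python. This is an illustration only and not the PRIMER 6 implementation: the function names are hypothetical, the normalisation is assumed to use the sample standard deviation (n - 1 denominator), and the Bray-Curtis coefficient is written in its dissimilarity form.

```python
import math


def zscore(values):
    """Normalise one soil parameter as (x_i - mean) / SD (sample SD assumed)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def sqrt_transform(abundances):
    """Square root transform, which down-weights dominant taxa."""
    return [math.sqrt(a) for a in abundances]


def presence_absence(abundances):
    """Reduce abundances to 1 (present) or 0 (absent)."""
    return [1 if a > 0 else 0 for a in abundances]


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two samples (0 = identical)."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator if denominator else 0.0


# Two hypothetical samples described by the abundances of three taxa.
sample_1 = [4, 0, 16]
sample_2 = [1, 9, 16]
d_abundance = bray_curtis(sqrt_transform(sample_1), sqrt_transform(sample_2))
d_presence = bray_curtis(presence_absence(sample_1), presence_absence(sample_2))
```

Comparing `d_abundance` with `d_presence` mirrors the check described above: if samples differ under the square root transform but not under presence/absence, the differentiation is driven by relative abundance rather than by which taxa are present.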

2.2.7 Network analysis

Network analysis on the bacterial and fungal OTUs was performed to identify potential associations between dominant members of the microbial community. Bacterial and fungal OTUs that had a minimum relative abundance of 0.1% and were observed in more than 70% of all samples within any of the three sampling groups were selected for network analysis. For network inference, all possible Spearman’s correlations among the relative abundances of bacterial and fungal OTUs were calculated using the Hmisc library (accessed January 20, 2014; http://cran.r-project.org/web/packages/Hmisc/index.html) in the R environment (http://www.r-project.org) and only significant correlations (p < 0.05) were retained. The Spearman’s correlation matrix was converted into a network with a correlation cut-off of 0.6 using the igraph library in the R environment (Csardi and Nepusz, 2006) and the networks were visualised using Cytoscape 3 (Shannon et al., 2003). The nodes in the reconstructed network represent the OTUs at 96% identity, whereas the edges correspond to strong and significant correlations between nodes. To describe the topology of the networks, a set of measurements (degree, average degree, clustering coefficient, network diameter, network density and network heterogeneity) was calculated using Cytoscape 3 (Shannon et al., 2003).
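
The network inference step can be sketched in plain Python. This is not the Hmisc or igraph code used in the study: the function names are hypothetical, the sketch keeps an edge when the absolute value of Spearman's rho reaches the cut-off (whether negative correlations were retained is not stated in the text), and it omits both the abundance and prevalence pre-filter and the significance test (p < 0.05).

```python
from itertools import combinations


def rank(values):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))


def build_edges(abundance, cutoff=0.6):
    """Return (otu_a, otu_b, rho) for every pair with |rho| >= cutoff."""
    edges = []
    for a, b in combinations(sorted(abundance), 2):
        rho = spearman(abundance[a], abundance[b])
        if abs(rho) >= cutoff:
            edges.append((a, b, rho))
    return edges


# Hypothetical relative abundances of four OTUs across five samples.
abundance = {
    "otu1": [1, 2, 3, 4, 5],
    "otu2": [2, 4, 6, 8, 50],
    "otu3": [5, 4, 3, 2, 1],
    "otu4": [3, 1, 4, 1, 5],
}
edges = build_edges(abundance)
```

The resulting edge list can be exported as a table for visualisation in a network tool such as Cytoscape.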

2.3 RESULTS

2.3.1 Soil properties

The studied soils were slightly acidic (pH 4.82 to 6.8), dry at the time of sampling (water content between 1 and 10%) and low in total carbon (<0.6% w/w), nitrate (below detection limit) and water extractable ammonium (<1.8 mg/kg on a dry matter basis (DMB)). In the PCO analysis, bottom site samples were well separated from the middle and top sites (one-way ANOSIM R=0.504, p=0.001), while the middle and top sites were less separated (R=0.445, p=0.001) (Figure 2.2). The vectors overlaid on the PCO plot revealed the most notable differences in environmental conditions, which included higher sulphate (SO₄²⁻), total manganese (MnO), exchangeable copper (Cu) and total iron (Fe₂O₃), and lower total sodium (Na₂O) and total potassium (K₂O), in the bottom site compared to the middle and top site samples (Figure 2.2).

Figure 2.2 Principal coordinate analysis (PCoA) based on Euclidean distance calculated from normalised environmental parameters. PCO plots were generated using a Euclidean distance matrix derived from normalised environmental parameters. The circle delineates the parameters with the strongest correlation with the reduced space (Pearson’s correlation >0.8), which were overlaid on the PCO plot. CECe denotes cation exchange capacity.

2.3.2 Soil microbial abundance and diversity

Pyrosequencing of all 93 soil samples yielded a total of 455,270 bacterial SSU rRNA gene sequences and 374,996 fungal ITS sequences after read-quality filtering. Across all 93 samples, bacterial sequences clustered into 6928 OTUs at 96% sequence identity. These OTUs were classified into 111 classes within 40 phyla. In contrast, fungal sequences clustered into 692 OTUs, consisting of 22 classes distributed within 5 phyla. A large number of fungal OTUs, accounting for 13% of total filtered sequences, could not be taxonomically assigned due to poor sequence alignment (<50% of nucleotides aligned). At lower taxonomic levels, 13, 24 and 50% of all bacterial SSU rRNA gene sequences and 17, 18 and 22% of all fungal ITS1 sequences could not be classified at the class, order and family level, respectively.

When all 93 samples were combined, Chloroflexi, Actinobacteria, Proteobacteria, Acidobacteria and Candidate Divisions WPS-2 and AD3 dominated the bacterial communities of Mitchell Peninsula soils (Figure 2.3). All of these phyla were present at greater than 5% relative abundance, together accounting for 90.8% of all bacterial sequences recovered. In addition, Gemmatimonadetes, Armatimonadetes, Candidate Division WS2, Planctomycetes and Bacteroidetes were also present, but at lower relative abundances, representing only 7.3% of all SSU rRNA gene sequences recovered. A further 30 phyla, all of lower relative abundance (<1% each) and including Cyanobacteria (0.36%), collectively accounted for only 1.9% of the total sequences recovered. On average, the relative abundance of Candidate Division WPS-2 was 9%, with one sample from the top site containing 25% WPS-2 (Figure 2.4). Moreover, the average relative abundance of Candidate Division AD3 was 5%, with a maximum of 17% observed in a soil sample from the middle site. At class level, the seven most abundant classes, which were all present at relative abundances of 5% or more, accounted for 68.5% of the total number of bacterial sequences recovered; these classes included Thermoleophilia and Actinobacteria from the phylum Actinobacteria; Ktedonobacteria, Thermomicrobia and C0119 from the phylum Chloroflexi; Alphaproteobacteria from the phylum Proteobacteria; and Acidobacteriia from the phylum Acidobacteria.

[Figure 2.3: stacked bar charts of relative abundance (0% to 100%) for MP, Top, Middle and Bottom; panel A shows phyla, panel B shows classes.]

Figure 2.3 Bacterial relative abundance at Mitchell Peninsula. The community composition of major (> 5% relative abundance) bacterial phyla (A) and classes (B) at Mitchell Peninsula (MP).

Figure 2.4 The different distribution of major bacterial classes among the three sample groups. All dominant bacterial classes at Mitchell Peninsula were identified at significantly different abundances across the sample groups. *: significant difference (t-test, p < 0.05) observed between bottom and higher elevation soils (mid and top soils); **: significant difference (t-test, p < 0.05) observed among bottom, mid and top soils.

With all 93 soil samples combined, the fungal community was dominated by Ascomycota and Basidiomycota at relative abundances of 77.1% and 9.7%, respectively (Figure 2.5). There was also a low proportion of sequences (0.01%) classified as Chytridiomycota, and 0.05% of sequences had no established taxonomic position and thus were designated as Fungi incertae sedis. At lower taxonomic levels, more than 46% of fungal sequences were classified as Lecanoromycetes and these consisted predominantly of the orders Teloschistales (23%) and Lecanorales (17%). Additionally, 10% of the fungal diversity belonged to the class Eurotiomycetes and consisted exclusively of Chaetothyriales.

The average bacterial SSU rRNA gene abundance at Mitchell Peninsula was 3.78 (±2.08) × 10⁸ copies per gram of dry soil. Significantly lower abundance was identified at the bottom site compared with soils collected from higher elevations (one-way ANOSIM, global R=0.319, p=0.001), with no significant difference observed between the middle and top sites (p=0.6) (Figure 2.7a). A similar trend was observed for the fungal 18S rRNA gene, where the average fungal abundance was 0.21 (±0.20) × 10⁷ copies per gram of dry soil, with 6-7 fold higher fungal abundance observed in the middle and top site samples compared to the bottom site samples (one-way ANOSIM, global R=0.342, p=0.001) (Figure 2.7b).

2.3.3 Microbial community similarity across Mitchell Peninsula

At the OTU level, Good's coverage of 93% for bacteria and 99% for fungi was estimated for all three sample groups at Mitchell Peninsula (Table 2.1). The bottom site displayed lower, but not significantly different alpha diversity indices when compared to the other sites (t-test, p=0.04, 0.06 and 0.13 for SOB, Chao1 and Shannon indices, respectively). However, for fungi, there was a two-fold increase in alpha diversity indices calculated for the middle and top sites compared to bottom site soils (t-test, p=0.001).

We investigated whether the microbial community structure in soils from the three sample groups varied at different levels of taxonomic resolution (i.e., phylum, class, order and family level). For the bacterial community, the bottom site was significantly different from the middle and top sites even at the phylum level (one-way ANOSIM, global R=0.435, p=0.001) (Figure 2.8a). The bottom site separated from the middle and top sites on the first axis of the PCO plot, which explained 35.7% of the variation. In contrast, the middle and top site soils could not be fully separated, even with the exclusion of the bottom sites from the dataset. The phyla contributing most strongly to the community variation (correlation > 0.6) were Candidate Divisions WPS-2 and AD3, Acidobacteria and Proteobacteria, which had lower abundances in the soils from the bottom site, and Chloroflexi, Bacteroidetes and Gemmatimonadetes, which had higher abundances there (Figure 2.8a). At class level, a significant difference in relative abundance (t-test, p < 0.05) between bottom and higher elevation samples was observed in phyla WPS-2 and AD3 and all dominant classes (at least 5% relative abundance) (Figure 2.4). In classes Actinobacteria and Acidimicrobiia, a significant difference (t-test, p < 0.05) was observed among all three sample groups.
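The ANOSIM R values reported above compare ranked between-group and within-group dissimilarities. As an illustration only, the statistic can be sketched as below. This is not the software used in this thesis; the function names and the toy matrix are hypothetical, and the permutation test that yields the p-value is omitted.

```python
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks of values, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of sorted positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def anosim_r(dist, groups):
    """ANOSIM R from a square dissimilarity matrix and per-sample group labels.

    R = (mean between-group rank - mean within-group rank) / (M / 2),
    where M is the number of sample pairs.
    """
    pairs = list(combinations(range(len(groups)), 2))
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (len(pairs) / 2)


# Hypothetical toy data: two "bottom" and two "top" samples.
toy_dist = [
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.0, 0.7, 0.6],
    [0.8, 0.7, 0.0, 0.2],
    [0.9, 0.6, 0.2, 0.0],
]
toy_groups = ["bottom", "bottom", "top", "top"]
toy_r = anosim_r(toy_dist, toy_groups)
```

In the toy data every between-group dissimilarity exceeds every within-group one, so R reaches its maximum of 1. Values near 0 indicate no separation between groups.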


Figure 2.5 Fungal relative abundance at Mitchell Peninsula. The community composition of major (> 5% relative abundance) fungal phyla (A) and classes (B) at Mitchell Peninsula (MP). Bar labels: MP, High, Middle, Low; horizontal axis: relative abundance, 0% to 100%. Phyla shown in (A): Ascomycota, Basidiomycota, unclassified_Fungi, Fungi incertae sedis, Chytridiomycota. Classes shown in (B): Dothideomycetes, Agaricomycetes, Leotiomycetes, Lecanoromycetes, unclassified_Fungi, Eurotiomycetes, Tremellomycetes, unclassified_Ascomycota, other.


Figure 2.6 The different distribution of major fungal classes among the three sample groups. Except for Leotiomycetes and Microbotryomycetes, all dominant fungal classes at Mitchell Peninsula were identified at significantly different abundances across the sample groups. UNC_Ascomycota: unclassified Ascomycota; *: significant difference (t-test, p < 0.05) observed between bottom and higher elevation soils (mid and top soils); **: significant difference (t-test, p < 0.05) observed between bottom and one of the higher elevation soils (mid or top soils); ***: significant difference (t-test, p < 0.05) observed among bottom, middle and top soils.


Figure 2.7 Abundance of bacterial and fungal communities. The bacterial abundances were measured by copy numbers of the SSU rRNA gene per gram of dry soil at three sample groups (A); the fungal abundances were measured by copy numbers of the 18S rRNA gene per gram of dry soil at three sample groups (B).

Table 2.1 Alpha diversity indices of bacterial and fungal diversity at Mitchell Peninsula and different sampling groups. Values are expressed as average (±standard deviation).

                     Good's coverage   Species observed   Chao 1           Shannon
Bacteria
  Mitchell overall   0.93 (±0.02)      634 (±205)         1197 (±422)      4.77 (±0.36)
  Bottom             0.93 (±0.02)      584 (±140.1)       1108 (±306)      4.63 (±0.44)
  Middle             0.93 (±0.02)      671 (±167.0)       1248 (±307)      4.88 (±0.26)
  Top                0.93 (±0.02)      645 (±272.2)       1232 (±575)      4.80 (±0.34)
Fungi
  Mitchell overall   0.99 (±0.01)      72.7 (±31.6)       89.1 (±40.4)     2.40 (±0.73)
  Bottom             1.00 (±0.005)     42.2 (±15.0)       51.9 (±18.9)     1.63 (±0.54)
  Middle             1.00 (±0.004)     87.1 (±29.9)       106.1 (±40.8)    2.70 (±0.52)
  Top                0.99 (±0.006)     86.5 (±24.4)       106.3 (±31.3)    2.79 (±0.47)
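For readers unfamiliar with the indices in Table 2.1, the following minimal sketch shows how they can be computed from the OTU counts of a single sample. It is illustrative rather than the pipeline used here: the function name and the toy counts are hypothetical, and the bias-corrected form of Chao1 and the natural logarithm for Shannon are assumptions, since the variant and log base are not stated in this section.

```python
import math
from collections import Counter


def alpha_diversity(counts):
    """Return observed species, Good's coverage, Chao1 and Shannon for one sample."""
    counts = [c for c in counts if c > 0]  # ignore OTUs absent from this sample
    total = sum(counts)
    observed = len(counts)
    freq = Counter(counts)
    singletons, doubletons = freq[1], freq[2]
    # Good's coverage: share of reads that do not belong to singleton OTUs.
    goods = 1 - singletons / total
    # Bias-corrected Chao1 (assumed variant); defined even with no doubletons.
    chao1 = observed + singletons * (singletons - 1) / (2 * (doubletons + 1))
    # Shannon index with natural logarithm (assumed base).
    shannon = -sum((c / total) * math.log(c / total) for c in counts)
    return {
        "observed": observed,
        "goods_coverage": goods,
        "chao1": chao1,
        "shannon": shannon,
    }


# Hypothetical toy sample: read counts per OTU.
toy_indices = alpha_diversity([10, 5, 2, 2, 1, 1, 0])
```

With these toy counts, two of the 21 reads belong to singleton OTUs, so coverage is 19/21, and Chao1 adds a third of a species to the six observed.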


Figure 2.8 Principal coordinate analysis based on distance calculated from (A) bacterial SSU rRNA gene OTU abundance matrix or (B) fungal ITS gene abundance matrix. PCO plots were generated using Bray-Curtis similarity matrices derived from square-root transformed relative abundance matrices at either phylum (bacteria) or family (fungi) level. Parameters with strong correlation (Pearson's correlation > 0.6) were overlaid on the PCO plot.
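The Bray-Curtis measure underlying the PCO plots can be sketched as follows. This is a minimal illustration, not the software used to produce Figure 2.8, and the function names and toy counts are hypothetical. The code returns a dissimilarity between 0 and 1; the corresponding similarity is one minus this value.

```python
import math


def sqrt_relative_abundance(counts):
    """Convert raw counts to relative abundances and apply a square-root transform."""
    total = sum(counts)
    return [math.sqrt(c / total) for c in counts]


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum(|u_i - v_i|) / sum(u_i + v_i)."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator


# Hypothetical toy samples with counts for the same two taxa.
same_profile = bray_curtis(
    sqrt_relative_abundance([1, 3]), sqrt_relative_abundance([2, 6])
)
no_shared_taxa = bray_curtis(
    sqrt_relative_abundance([4, 0]), sqrt_relative_abundance([0, 9])
)
```

Because the counts are converted to relative abundances first, two samples with the same proportions but different sequencing depth have a dissimilarity of 0, while samples sharing no taxa have a dissimilarity of 1. The square-root transform reduces the weight of the most abundant taxa.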


Fungal communities could not be separated at phylum level on the fungal PCO plot. However, they did separate at lower taxonomic resolutions including class, order and family (Figure 2.8b). At class level, a significant difference between bottom and higher elevation samples was observed for most dominant classes (Figure 2.6). The lichenous fungal class Lecanoromycetes had significant decreases in relative abundance at higher elevations and was probably replaced by Dothideomycetes and Eurotiomycetes. At the family level, the soils from the bottom site were well separated from the middle and top site soils on the first ordination axis, which explained 33.2% of the variation (one-way ANOSIM, global R=0.559, p=0.001). The separation at family level was also confirmed by presence/absence treatment of the dataset. In contrast to the patterns observed for the bacterial community, the fungal communities from the middle and top sites separated on the second axis of ordination at family level (one-way ANOSIM, R=0.204, p=0.001), which explained a further 8.2% of the variation (Figure 2.8b). The transition of dominant fungal groups between elevation groups was more clearly observed at family level: Caliciaceae was abundant only in the bottom soils; in the soils from the middle site its relative abundance was lower and that of Lecanoraceae was higher. Lastly, in soils from the top site, Cortinariaceae and Herpotrichiellaceae were dominant.

The separation of soils based on bacterial and fungal diversity closely resembled the variations in measured environmental parameters (Figure 2.2) (Mantel test, p=0.001 for both bacterial and fungal diversity). However, the separation of middle and top site soils was more pronounced in measured environmental parameters than was observed for the bacterial community (Figure 2.8a). DISTLM analysis identified that variation in the bacterial community assemblage was best explained by longitude (21% of variability) and total carbon (6% of variability), while another 20 environmental and geographical parameters explained a further 28% of the total variation. Similarly, longitude was identified as the most important parameter explaining fungal community variations (20% of variation), followed by slope (additional 8% of variation), total carbon (additional 4% of variation) and 11 other parameters that together explained 46% of community variation (Appendix 2.1).

2.3.4 Network analysis of bacterial and fungal communities and measured environmental parameters

Network analysis was used to investigate the co-occurrence patterns of biotic (bacterial and fungal OTUs) and measured abiotic variables (soil environmental parameters) and to predict potential ecological interactions between members of the community. To address the significant difference in distributions of bacterial and fungal communities at different elevation groups, only OTUs that were identified in more than 70% of samples within any elevation group and had a relative abundance greater than 0.1% of that elevation group were included in the network analysis.

Together, eight sub-networks comprising 123 bacterial and fungal OTUs were observed, which were dominated by OTUs belonging to Chloroflexi, Actinobacteria and Proteobacteria (27, 22 and 15% of the nodes, respectively) (Figure 2.9). The OTUs present in the network also represented 33.3 and 40.5% of the total fungal and bacterial communities, respectively (by relative abundance). All 280 significant correlations observed in the network were positive, and the number of significant bacteria-to-bacteria correlations (79% of all possible) outnumbered bacteria-to-fungi (17%) and fungi-to-fungi correlations (4%). However, this may be because of the higher number of bacterial OTUs present in the network. Bacterial OTUs belonging to Chloroflexi (OTU7, 115, 168, 398, 581, 598 and 685), Actinobacteria (OTU25 and 166) and Acidobacteria (OTU26), and fungal OTUs (OTU4234, 4366 and 4595) belonging to Lecanoromycetes, had a higher number of correlations (degree > 10) than other OTUs. These high-degree OTUs formed a complicated cluster with extensive inter-connections.
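The OTU selection rule and the rank correlation behind the network can be sketched as below. This is an illustration under stated assumptions, not the software used in this thesis: the function names and toy data are hypothetical, "relative abundance greater than 0.1% of that elevation group" is interpreted here as the mean relative abundance within the group, and the significance test on each correlation is omitted.

```python
def keep_otu(rel_abund_by_group, min_prevalence=0.70, min_abundance=0.001):
    """Decide whether one OTU enters the network.

    rel_abund_by_group maps an elevation group to the OTU's relative
    abundance (as a fraction) in each sample of that group. The OTU is kept
    if ANY group passes both the prevalence and the abundance threshold.
    """
    for values in rel_abund_by_group.values():
        prevalence = sum(v > 0 for v in values) / len(values)
        mean_abundance = sum(values) / len(values)  # assumed interpretation
        if prevalence > min_prevalence and mean_abundance > min_abundance:
            return True
    return False


def average_ranks(values):
    """Return 1-based ranks of values, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranks of x and y.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical OTU: present in 3 of 4 bottom samples, absent from top samples.
toy_kept = keep_otu({"bottom": [0.002, 0.003, 0.0, 0.004], "top": [0.0, 0.0, 0.0, 0.0]})
toy_rho = spearman_rho([1, 2, 3, 4], [10, 20, 30, 45])
```

Applying the filter per elevation group, rather than across all samples, keeps OTUs that are consistently present at only one elevation. Such OTUs would be lost under a site-wide prevalence threshold.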


Figure 2.9 Network analysis of 80% cutoff based on OTU correlation analysis. A connection stands for a very strong (Spearman's ρ > 0.6) and significant (p-value < 0.005) correlation, the size of each node is proportional to the number of connections (degrees), and core OTU IDs are labeled. Bacteria (○), Fungi (△).

2.4 DISCUSSION

Mitchell Peninsula hosts a unique microbial community comprised of an unusually high dominance of Candidate phyla bacteria WPS-2 and AD3 (Figure 2.2). To my knowledge, soils with such high abundance of both candidate phyla (i.e., WPS-2 and AD3) have not been reported, and I believe this site represents a microbial dark matter hotspot. Beyond these Candidate phyla, many of the abundant taxa detected in Mitchell Peninsula belonged to classes that are also poorly characterized. For example, the bacterial classes Ktedonobacteria, Thermoleophilia, Thermomicrobia and Candidate class C0119 accounted for average relative abundances of 12%, 15%, 11% and 7%, respectively (Figure 2.3). With fewer than 10 cultivated isolates from each of these classes described to date (Yokota 2012; Goodfellow et al., 2009), their ecological and biological capabilities are underexplored. However, it has been reported that genetic material from dead microorganisms (bacteria and fungi) can affect DNA-based analyses of microbial diversity and inflate the observed prokaryotic and fungal diversity by as much as 55% (Carini et al., 2016). The effect on Antarctic soil biodiversity investigations can be more severe because the low temperatures of Antarctica favour DNA preservation. Therefore, further studies on the living and active microbial community using propidium monoazide treated DNA (Carini et al., 2016) and reverse transcription PCR (Mills et al., 2012) are required to elucidate the active community structure in Mitchell Peninsula.

In line with our observations of a high proportion of novel bacterial phyla being present in Mitchell Peninsula soils, there were a large number of ITS fungal sequences that could not be classified reliably within the Fungi kingdom. This was due to poor sequence similarity to available ITS database sequences (<50% of nucleotides aligned) and could be attributed to a limited number of Antarctic fungal ITS sequences present in the database.

Water content has been considered an important factor driving microbial abundance and composition in Polar soils (Niederberger et al., 2015), and it was found to be a major driver of bacterial community structures in this study (Appendix 2.1). Mitchell Peninsula soils are dry (average water content of 4%) and poor in nutrients, therefore potentially limiting the growth of microorganisms, but also selecting for bacterial groups that have adapted to these extreme conditions. A phenomenon of decreasing bacterial and fungal abundance and diversity with increasing latitude has been proposed for Polar Regions (Yergeau et al., 2007). The bacterial abundance at Mitchell Peninsula, measured by soil bacterial SSU rRNA gene copy numbers, was 20 times lower than at sub-Antarctic sites such as Falkland Island or Signy Island (51 and 60º S) and approximately five times lower than in Anchorage Island soils (67º S). The fungal abundance, measured by fungal SSU rRNA gene copy numbers, was similar to that detected at fell-field sites at Falkland and Signy Islands, but only 50% of the level detected at Anchorage Island (Yergeau et al., 2007). Therefore, the bacterial and fungal abundances estimated in this study were consistent with this hypothesis.

The multiple sub-networks observed potentially represent different environmental niches occupied by various bacterial taxa, which reflects the conservation of critical ecological traits, i.e., a requirement for similar environmental niche conditions for growth (Barberán et al., 2012). Lichenous fungi (OTU 4234 and 4595) formed extensive associations with bacteria, and this may reflect complex fungi-bacteria dynamics such as symbiosis between bacteria and fungi, with the lichens providing shelter to the bacterial symbionts by increasing relative humidity and reducing ultraviolet exposure and temperature fluctuation (Tiao et al., 2012).

Given the low soil carbon level and low abundance of phototrophic Cyanobacteria detected in Mitchell Peninsula, alternative carbon fixers within the community are yet to be identified. In addition, a high abundance of lichenous fungi was detected here while the algal or cyanobacterial symbiont was not identified. Within the bacterial community recovered, potential phototrophic and chemolithoautotrophic carbon fixers are present. For example, Chloracidobacterium and Chloroflexi were identified at high abundance, and phototrophic carbon fixation capacity has been reported within these phyla (Lacap et al., 2011, Garcia Costas et al., 2012). In addition, a high abundance of candidate phyla WPS-2 and AD3 was identified at Robinson Ridge, and their ecological role and the mechanism by which they survive in this carbon and nitrogen limited, acidic environment require further investigation. Therefore, a functional analysis using shotgun metagenomics is required to resolve the carbon acquisition strategy in Mitchell Peninsula soils, allowing prediction of the genomic capacities of these candidate phyla.

Chapter 3 Atmospheric chemotrophy: A unique functional capacity of East Antarctic microbial communities

The soil collection and physicochemical analysis were carried out by AAD; DNA extraction was performed by Josie van Dorst; shotgun library preparation, Illumina data processing and population genome binning and assessment were performed by Jason Steen and Inka Vanwonterghem; genome annotation, metabolic pathway reconstruction, comparative genome analysis and phylogeny analyses of RuBisCO and hydrogenases were done by the candidate.

3.1 Introduction

In Chapter 2, I demonstrated the Mitchell Peninsula in the Windmill Islands region, East Antarctica, to be a microbial dark matter hotspot, characterised by high abundances of candidate phyla WPS-2 and AD3 combined with low abundances of Cyanobacteria. Cyanobacteria are frequently identified in temperate and Antarctic environments (Makhalanyane et al., 2015), and are often considered crucial members of these ecosystems, providing bio-available carbon and nitrogen to microbial community members (Makhalanyane et al., 2013). In contrast, the ecological function of the candidate phyla bacteria WPS-2 and AD3 is completely unknown, and the interactions between these rare bacteria and other community members demand further investigation.

It is now well established that the phylogenetic diversity and potential functioning of candidate divisions or microbial dark matter (MDM) is extremely high (Rinke et al., 2013, Hua et al., 2014). In recent years, the application of differential coverage binning of multiple genomes and single-cell genomics approaches has begun to bridge the gap, providing novel insights into candidate division bacteria spanning broad ecosystems such as aquifers, bioreactors, acid mine drainage systems, seawater and sediment (Tyson et al., 2004, Albertsen et al., 2013, Rinke et al., 2013, Brown et al., 2015). However, the recovery of draft genomes from soil has been problematic (Howe et al., 2014), with the exception of a dry volcanic soil community that had a relatively simple structure, with Actinobacteria accounting for over 97% of the community (Lynch et al., 2014).

The aim of this chapter was to apply shotgun sequencing and differential coverage binning of multiple genomes to East Antarctic desert soils that host unique microbial communities, high in unknown taxa/representatives of bacterial candidate lineages. Through the construction of draft genomes from novel bacterial phyla WPS-2 and AD3, as well as other rare taxa that dominate both Mitchell Peninsula and Robinson Ridge (Siciliano et al., 2014, Ferrari et al., 2015), I aimed to reveal what carbon acquisition strategies dominated this ecosystem. Doing so provides the opportunity to uncover not only the role that the candidate phyla bacteria are playing in this extreme environment, but also the primary production strategy these microorganisms adopted to survive in the extremely cold, harsh, nutrient-poor soil of eastern Antarctica.

3.2 Material and methods

3.2.1 Site description

Because of the limited sample material available from Mitchell Peninsula, DNA extracted from Robinson Ridge was used instead. Robinson Ridge is situated in the south of the Windmill Islands region of eastern Antarctica, approximately 10 km south of Casey Station and five km south of Mitchell Peninsula. Mineral soils from Robinson Ridge (-66.367739, 110.585262) were highly similar to those of Mitchell Peninsula, as both sites are described as polar deserts devoid of vegetation and large animal life (Figure 3.1, Table 3.1), and both have been shown to contain unusually high abundances of the candidate bacterial phyla WPS-2 and AD3 (Siciliano et al., 2014).

Table 3.1 Measured environmental parameters in Robinson Ridge surface soils.

Sample ID                       25913    25914    25919
Total carbon (% w/w)            0.13     0.14     0.24
*Total nitrogen                 160      120      230
*Total phosphorus               1300     1200     1000
% Moisture                      4        5        4
Conductivity (µS/cm) at 25 °C   9.1      9.2      12.4
pH                              5.4      5.9      5.1
*Cl                             3.7      4.9      5.8
*NO2                            <0.15    <0.15    <0.15
*Br                             <0.15    <0.15
*NO3                            <0.76    <0.76    <0.76
*PO4                            5        5.6      3.9
*SO4                            2.1      1.9      2.5
*NH4                            0.55     <0.55    0.73

*mg/kg dry mass basis

Figure 3.1 Map of Robinson Ridge and Mitchell Peninsula in the Windmill Islands, East Antarctica.

3.2.2 Metagenome sequencing and assembly

DNA from Robinson Ridge soils was extracted using the method described in Section 2.2.3. Metagenome libraries were then prepared using the Nextera DNA Library Preparation Kit (Illumina, CA, USA) and sequenced using three fifths of an Illumina HiSeq2000 flowcell lane at the Institute for Molecular Biosciences (University of Queensland, Australia). The raw reads (2 x 100 bp reads, 14.9 Gb) were processed using Trimmomatic (Bolger et al., 2014) for adaptor removal and quality filtering and BBMap (https://sourceforge.net/projects/bbmap) to merge overlapping reads. The processed reads were combined into a large co-assembly using the de novo assembly algorithm in CLC Genomics Workbench v8 (CLC Bio, Denmark) and gaps within scaffolds were closed using abyss-sealer (Simpson et al., 2009). Processed reads from each dataset were mapped onto the co-assembly with BamM (https://github.com/minillinim/BamM.git).

3.2.3 Population genome binning and assessment

The scaffolds were binned based on differential coverage profiles, k-mer frequencies and GC content using GroopM (Imelfort et al., 2014) and MetaBAT (Kang et al., 2015). Population genome bins obtained with GroopM were further refined using the GroopM refine function. Genome completeness and contamination were estimated with CheckM (Parks et al., 2015) from the presence of lineage-specific single-copy marker genes. The two sets of population genomes from the GroopM and MetaBAT binning were compared using RefineM (https://github.com/dparks1134/RefineM.git) to identify possible duplicates, and the best genomes (> 50% completeness, < 10% contamination) were selected for further analysis.
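The final selection step can be illustrated with a minimal Python sketch. It assumes that the completeness and contamination estimates have already been exported from CheckM as plain numbers. The `BinQuality` class, the `passes_thresholds` function and the last bin in the list are hypothetical and are not part of the original workflow; the first three bins use the values reported in Table 3.3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinQuality:
    """Quality estimates for one population genome bin (values in percent)."""
    name: str
    completeness: float
    contamination: float


def passes_thresholds(b, min_completeness=50.0, max_contamination=10.0):
    """Apply the strict inequalities stated in the text (> 50%, < 10%)."""
    return b.completeness > min_completeness and b.contamination < max_contamination


# The first three entries use values from Table 3.3; the last entry is a
# hypothetical low-quality bin added only to show a rejection.
bins = [
    BinQuality("bin9", 67.5, 9.8),
    BinQuality("bin16", 55.8, 0.0),
    BinQuality("bin22", 99.1, 0.9),
    BinQuality("hypothetical_bin", 45.0, 12.0),
]

selected = [b.name for b in bins if passes_thresholds(b)]
print(selected)
```

Because both comparisons are strict, a bin with exactly 50% completeness or exactly 10% contamination would be excluded under this reading of the thresholds.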

3.2.4 Metabolic annotation

The population genomes were annotated using Prokka (Seemann, 2014) and the KEGG Orthology database (Kyoto Encyclopedia of Genes and Genomes) (Kanehisa et al., 2000). Genes specifically involved in carbohydrate metabolism were identified using dbCAN (Yin et al., 2012), which is an HMM-based database for carbohydrate-active enzyme (CAZy) (Lombard et al., 2014) annotation. Potential secondary metabolite biosynthesis gene clusters were identified using antiSMASH (http://antismash.secondarymetabolites.org/) (Blin et al., 2013).

3.2.5 Microbial community profile

A profile of the microbial community composition was generated with CommunityM (https://github.com/dparks1134/CommunityM.git) by mapping reads onto a database of bacterial, archaeal and eukaryotic 16S/18S rRNA genes, i.e. the SILVA (Quast et al., 2013) and GreenGenes (DeSantis et al., 2006) databases clustered at 97% identity.

3.2.6 Phylogenetic inference

A genome tree was generated using 38 universal conserved marker genes (Darling et al., 2014) from 15,982 bacterial and archaeal genomes publicly available from the Integrated Microbial Genomes database (IMG) (Markowitz et al., 2012) together with the recovered population genomes. Marker genes were identified using HMMs and a concatenated alignment of the marker genes was used to generate the genome tree with FastTree (Soo et al., 2014). The genome tree was visualized and dereplicated manually to 4624 genomes in ARB. The dereplicated tree was uploaded to iTOL (Letunic et al., 2011) for visualization as a circular tree and finalized in Adobe Illustrator.
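The idea of a concatenated marker gene alignment can be sketched in Python as follows. This is only an illustration of the concatenation step, not the pipeline used in this chapter. It assumes that each marker gene has already been aligned separately, and that a genome lacking a marker is padded with gap characters, which is a common convention but an assumption here. The function name `concatenate_alignments` and the toy sequences are hypothetical.

```python
def concatenate_alignments(alignments, genome_ids):
    """Join per-marker alignments column-wise into one sequence per genome.

    alignments: list of dicts (one per marker gene) mapping genome id to its
                aligned sequence; sequences within one dict share a length.
    genome_ids: genomes to include; a genome missing from a marker alignment
                is padded with '-' (assumed convention).
    """
    parts = {g: [] for g in genome_ids}
    for marker in alignments:
        lengths = {len(seq) for seq in marker.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one marker alignment differ in length")
        width = lengths.pop()
        for g in genome_ids:
            parts[g].append(marker.get(g, "-" * width))
    return {g: "".join(chunks) for g, chunks in parts.items()}


# Toy example with two markers; genomeB lacks the second marker.
marker1 = {"genomeA": "MK-L", "genomeB": "MKAL"}
marker2 = {"genomeA": "GT"}

concatenated = concatenate_alignments([marker1, marker2], ["genomeA", "genomeB"])
for genome, sequence in concatenated.items():
    print(genome, sequence)
```

Every genome ends up with a sequence of the same total length, which is what a tree-building program requires as input.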

3.2.7 Comparative genomics analysis

The Robinson Ridge metagenomes were compared with 17 metagenomes from four different ecosystems: volcanic soil, grassland, desert and Arctic soils. The metagenomes were obtained from the MG-RAST server (http://metagenomics.anl.gov/), and quality filtering, gene prediction and annotation were performed using the MG-RAST server. Hierarchical cluster analysis was performed using the PRIMER-E CLUSTER analysis, based on a similarity matrix derived from the relative abundance of proteins classified using the KEGG system, and significance was tested using the SIMPROF test with 1000 permutations. The relative abundance of a given KEGG category was calculated by dividing the number of genes classified under that category by the total number of genes with a KEGG annotation.
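The relative abundance calculation described above can be written as a short Python sketch. It assumes that each predicted gene carries either one KEGG category label or no annotation (represented here as `None`). The function name `kegg_relative_abundance` and the toy gene list are hypothetical.

```python
from collections import Counter


def kegg_relative_abundance(gene_categories):
    """Return the share of each KEGG category among KEGG-annotated genes.

    gene_categories: iterable with one entry per gene, either a category
                     name or None when the gene has no KEGG annotation.
    """
    annotated = [c for c in gene_categories if c is not None]
    if not annotated:
        return {}
    total = len(annotated)  # denominator: genes with a KEGG annotation only
    return {category: n / total for category, n in Counter(annotated).items()}


# Toy input: five genes, one of which has no KEGG annotation.
genes = [
    "Carbohydrate metabolism",
    "Carbohydrate metabolism",
    "Energy metabolism",
    None,
    "Amino acid metabolism",
]

abundance = kegg_relative_abundance(genes)
for category, share in sorted(abundance.items()):
    print(f"{category}: {share:.2f}")
```

Unannotated genes are excluded from the denominator, so the shares of the annotated categories sum to one.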

3.2.8 Gene cluster and phylogenetic analysis of RubisCO, hydrogenase and ammonia monooxygenase subunit C (amoC) genes

The assembled contigs that contained RubisCO, hydrogenase or ammonia monooxygenase subunit C (amoC) genes were extracted from the metagenome (ammonia monooxygenase subunit A was not recovered), and the contigs that contained a full RubisCO operon were selected for gene cluster analysis. To determine the phylogeny of the RubisCO large and small subunit genes from the Robinson Ridge metagenome, a reference phylogeny was constructed using available sequence data from previous studies. Type II, III and IV RubisCO large subunit protein sequences were obtained from Tabita et al. (2008); Form Ia, Ib and Ic RubisCO large and small subunit protein sequences were obtained from Badger and Bek (2008); and Type Id RubisCO large and small subunit protein sequences were translated from the gene sequences reported by Kong et al. (2012), with additional sequences retrieved from Tebo et al. (2015) and Park et al. (2009). For hydrogenases, annotated reference sequences were obtained from Greening et al. (2015a) and Vignais and Billoud (Vignais et al., 2007); for amoC genes, annotated reference sequences clustered at 90% were retrieved from Uniprot (http://www.uniprot.org/). The Robinson Ridge protein sequences were aligned against reference sequences using ClustalX (Larkin et al., 2007) and then manually curated using MEGA (Kumar et al., 2008). The phylogenetic tree was constructed using FastTree with generalized time-reversible (GTR) models (Price et al., 2010), and the confidence level of the tree topology was evaluated by bootstrap analysis using 1,000 sequence replications.

3.3 Results

3.3.1 The Robinson Ridge metagenome

A metagenome of 264,245,447 bp, assembled into 157,031 contigs, was obtained (Table 3.2). In total, 451 prokaryotic and eukaryotic OTUs were identified from 16S and 18S rRNA genes, and the ecosystem was predominantly comprised of Bacteria (97.8% relative abundance), followed by Eukaryota (1.3%) and Archaea (0.9%) (Figure 3.2). Of the classified prokaryotes, Thaumarchaeota (0.9%) was the only archaeal taxon identified, while Actinobacteria dominated (46%) the prokaryotic community. Within the bacterial domain, Actinobacteria was followed by a high abundance of Chloroflexi (13.3%), Proteobacteria (9.8%), candidate division AD3 (6.3%), Acidobacteria (5.5%) and candidate division WPS-2 (5.2%). Remarkably, as observed in Mitchell Peninsula, Cyanobacteria were only present at up to 0.3% of the relative abundance of bacteria. Within the Eukaryota, Fungi were most dominant and accounted for 70.9% of the eukaryotic community, but this was equivalent to just 0.9% of the total microbial community.
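The conversion between a within-domain share and a whole-community share can be checked with a few lines of Python. All percentages are taken from the paragraph above; the variable names are illustrative only.

```python
# Domain-level shares of the whole community, in percent (from the text).
domain_shares = {"Bacteria": 97.8, "Eukaryota": 1.3, "Archaea": 0.9}

# Share of Fungi within the eukaryotic community, in percent (from the text).
fungi_within_eukaryota = 70.9

# Share of Fungi in the whole community = eukaryote share x fungal share.
fungi_of_total = domain_shares["Eukaryota"] * fungi_within_eukaryota / 100

print(f"Sum of domain shares: {sum(domain_shares.values()):.1f}%")
print(f"Fungi as share of the total community: {fungi_of_total:.2f}%")
```

The product, 1.3% x 70.9% = 0.92%, rounds to the 0.9% stated in the text.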

Table 3.2 Robinson Ridge metagenome sequence statistics.

Metagenome sequence statistic          Count             Proportion of total in category
Sequence reads (paired)                33,645,878        -
Reads after quality filter (paired)    31,562,291        93.8%
Number of contigs                      157,031
Number of contigs (>= 1 kb)            61,711
Longest contig                         462,793 bp
Shortest contig                        500 bp
Total genome size                      264,245,447 bp
GC content                             62.89%
N50 contig size                        157,031 bp
Coverage (25913)                       2.89x
Coverage (25914)                       2.58x
Coverage (25919)                       3.18x
Predicted genes (Prokka)               208,343
Genes with predicted functions         112,716           54%
Genes assigned with KEGG ID            63,244            30%
Genes with COGs                        71,413            34%

Figure 3.2 Eukaryotic and prokaryotic taxonomy compositions of Robinson Ridge soils.

Relative abundance of the prokaryotic community within the metagenome via taxonomic assignment of recovered prokaryotic 16S rRNA gene reads (A) and relative abundance of the eukaryotic community within the metagenome via taxonomic assignment of recovered eukaryotic 18S rRNA gene reads (B).

3.3.2 Recovery of draft genomes from Robinson Ridge soil

Using differential coverage binning, we successfully separated 23 prokaryotic draft genomes from the Robinson Ridge metagenome. Of these, 22 were classified as bacteria and one was classified as an archaeon (Table 3.3). These recovered draft genomes were abundant within at least one of the three samples analysed (relative abundance >1%) (Table 3.3). The completeness of each genome, estimated from the presence of conserved single-copy marker genes, ranged between 55.8 and 99.1%, with a higher percentage of genome contamination (>5%) observed in the Actinobacterial draft genomes (bins 1, 4, 5, 9 and 11). Of the twenty-two bacterial draft genomes recovered, eleven were classified as Actinobacteria (bins 1 to 11), two as Verrucomicrobia (bins 20 and 21), three as Chloroflexi (bins 16, 17 and 18) and one as Proteobacteria (bin 15). The AD3 and WPS-2 draft genomes each formed a separate cluster (bins 12 to 14 and bins 22 and 23, respectively; Figure 3.3) and their taxonomy was determined based on 16S rRNA gene sequences searched against the Greengenes database. Phylogenetic analysis indicated that the closest lineages to AD3 and WPS-2 appear to be Chloroflexi and Armatimonadetes, respectively.

The two candidate division WPS-2 draft genomes recovered were similar in genome size and GC content (Table 3.3). In comparison, three draft genomes were classified as candidate division AD3: two bins (12 and 13) had similar genome sizes (3.0 Mb), while the third bin (14) was much larger, at least 5.3 Mb. Draft genome bin 14 had a higher contamination rate, and its near-doubled genome size could indicate multiple genomes that were not separated.

Figure 3.3 Phylogenetic tree demonstrating the phylogenetic relationship between recovered candidate division bacteria WPS-2 and AD3 inferred from multiple marker genes.

Phylogeny was inferred from 38 universal conserved marker genes from 15,982 bacterial and archaeal genomes publicly available from the Integrated Microbial Genomes database (IMG). Candidate division AD3 appeared to be most closely related to Chloroflexi, while candidate division WPS-2 appeared to be most closely related to Armatimonadetes.

Table 3.3 Summary of near-complete draft genomes recovered from Robinson Ridge metagenome.
Complete: completeness; Conta: contamination; Rel. abund range: relative abundance range; Dep: average sequencing depth.

ID     Phylogeny                                16S  Complete  Conta  Rel. abund   Size  Dep  Contigs  N50     GC    Annotated
                                                     (%)       (%)    range (%)    (Mb)       (#)      (bp)    (%)   genes (#)
bin1   p_Actinobacteria; o_Acidimicrobiales     +    93.1      5.6    0.5 - 2.1    2.3   13x  247      13870   68.3  2560
bin2   p_Actinobacteria                         +    94.1      1.1    1.1 - 3.0    2.4   22x  91       39959   59.6  2624
bin3   p_Actinobacteria; o_Pseudonocardiales    +    84.9      2.1    0.3 - 4.8    3.0   25x  93       49695   70.2  3098
bin4   p_Actinobacteria; o_Pseudonocardiales         83.5      5.3    0.5 - 4.7    3.7   15x  228      24463   69.1  3900
bin5   p_Actinobacteria; o_Pseudonocardiales         81.0      6.6    1.3 - 4.1    4.6   14x  462      14183   68.4  5070
bin6   p_Actinobacteria; o_Pseudonocardiales         77.9      3.1    0.0 - 3.2    3.9   13x  437      10907   70.5  4295
bin7   p_Actinobacteria; o_Pseudonocardiales         91.8      3.0    8.8 - 10.0   4.6   60x  379      20402   67.4  5155
bin8   p_Actinobacteria; o_Pseudonocardiales         89.5      3.4    0.2 - 12.5   5.4   26x  608      15960   67.5  5966
bin9   p_Actinobacteria; o_Pseudonocardiales         67.5      9.8    0.6 - 1.7    3.7   15x  747      6168    68.2  4526
bin10  p_Actinobacteria; o_Solirubrobacterales  +    95.3      0.9    2.1 - 3.6    2.7   42x  100      52627   69.6  2729
bin11  p_Actinobacteria; o_Solirubrobacterales       87.3      6.7    1.7 - 3.0    3.0   32x  307      16457   70.1  3358
bin12  p_AD3                                    +    95.3      2.3    0.3 - 3.0    3.0   17x  302      23210   69.2  3231
bin13  p_AD3                                    +    96.3      0.0    1.7 - 4.9    3.0   34x  71       62854   66.7  3135
bin14  p_AD3                                    +    92.4      4.6    1.9 - 4.0    5.3   14x  358      25215   68.3  5615
bin15  p_Proteobacteria; o_Rhizobiales               90.0      1.4    1.7 - 2.8    3.6   24x  429      10900   59.8  4096
bin16  p_Chloroflexi                                 55.8      0.0    1.3 - 5.2    2.7   35x  523      5651    60.5  2592
bin17  p_Chloroflexi                                 84.8      5.0    0.3 - 2.0    3.7   12x  440      12432   61.6  4203
bin18  p_Gitt-GS-136                            +    91.7      0.5    1.0 - 2.6    2.2   28x  48       98473   69.6  2218
bin19  p_Thaumarchaeota                         +    98.1      3.9    1.6 - 2.6    4.0   21x  278      22997   38.9  5405
bin20  p_Verrucomicrobia; o_Chthoniobacterales  +    92.9      5.1    0.7 - 1.8    3.6   16x  173      31661   58.4  3513
bin21  p_Verrucomicrobia; o_Chthoniobacterales  +    66.0      2.3    0.0 - 1.7    2.5   11x  422      6693    58.7  2825
bin22  p_WPS-2                                  +    99.1      0.9    1.9 - 2.6    2.6   25x  19       214289  62.6  2529
bin23  p_WPS-2                                  +    94.9      0.0    0.3 - 2.7    1.9   27x  43       74679   60.1  1988

3.3.3 Energy and nutrient dynamics inferred from the Robinson Ridge metagenome

In total, 208,343 genes were predicted from the metagenome, with approximately 54% of the genes found to have a predictable function (Table 3.2). Based on the KEGG classification system, the most abundant metabolic categories identified within the complete metagenome were carbohydrate (18%), amino acid (14%) and energy metabolism (8%) (Appendix 3.1). The Robinson Ridge metagenomes were then compared with 17 publicly available metagenomes from grassland, desert, Arctic tundra and volcanic soils. Hierarchical clustering grouped the Robinson Ridge soil samples in a separate cluster, which was most closely related to volcanic and desert soils (Appendix 3.2). By comparing the relative abundance of KEGG categories, genes associated with carbohydrate, lipid, polyketide, other amino acids and xenobiotics metabolism were found to be significantly over-represented (t-test, p < 0.05) in Robinson Ridge compared with the four different ecosystems examined (Figure 3.4).

Figure 3.4 Comparison of relative abundance of KO gene categories between Robinson Ridge and other soil ecosystems.

The dendrogram shows the hierarchical distance between the average relative abundance of major (>1% relative abundance) KEGG categories. Categories labeled with an asterisk were over-represented in the Robinson Ridge metagenomes (t-test, p < 0.05).

3.3.3.1 Organic carbon utilization and biosynthesis

To evaluate the environmental carbon acquisition capacity of the microbes in Robinson Ridge soils, key enzymes for organic carbon degradation were identified (Appendix 3.3). Genes for the starch hydrolysis enzymes phosphorylase, alpha amylase and glucoamylase were widely identified within the draft genomes in the bacterial community (K00688, K01176, K05343, K01178, K01196, K00705, K12047, K01187 and K15922, Appendix 3.3). Only Actinobacteria, Chloroflexi and Verrucomicrobia could potentially degrade more structurally complicated polysaccharides such as chitin, pectin and cellulose (K01183, K01051, K01179, K05349 and K05350, respectively). Putative chitinase genes were identified in one Actinobacterial genome (bin 2), while pectin and xylan degradation capability was present in Verrucomicrobia (bin 21) and Chloroflexi (bin 18), respectively. Putative genes for endoglucanase, which is involved in cellulose degradation, were predominantly identified in the Actinobacterial community (bins 5, 10 and 11). However, the utilisation of partially degraded cellulose (cellobiose) was not limited to Actinobacteria, as a range of different bacterial and archaeal draft genomes carried putative genes encoding two different beta-glucosidases that are capable of hydrolysing cellobiose (K05349 and K05350, Appendix 3.3). Genes for trehalose biosynthesis were widely observed in the draft genomes, except for WPS-2 (bins 22 and 23) and the Thaumarchaeota genome (bin 19) (K00697, K16055, K01087, K05343, K02438, K06044, K01236 and K13057, Appendix 3.3). All the remaining draft genomes carried genes for the biosynthesis of trehalose from starch, and frequently more than one trehalose biosynthesis pathway was identified within each of the draft genomes analysed.

3.3.3.2 Secondary metabolites biosynthesis capacity

AntiSMASH identified 23 potential secondary metabolite biosynthesis gene clusters. Of these, terpene gene clusters were the most numerous (n=7), followed by bacteriocin (n=3), nonribosomal peptide synthetase (NRPS, n=3) and other types of polyketide synthases (PKS, n=3) (Table 3.4). No type I polyketide synthase was identified, while both type II and III PKS were predicted. Terpene and bacteriocin biosynthesis gene clusters were identified in the recovered draft genomes: terpene gene clusters were identified in draft genomes of candidate division AD3, Actinobacteria and Proteobacteria, while bacteriocin gene clusters were only identified in Proteobacteria and Actinobacteria. The remaining gene clusters, identified from unbinned contigs, were most similar to Actinobacteria sequences based on BLAST searches.

5 3.3.3.3 Carbon fixation

Only the bacterial CBB cycle and archaeal 3-hydroxypropionate cycle (Zhalnina et al., 2014) were identified within the Robinson Ridge metagenome (Appendix 3.4 and 3.6). On average, 0.04% of the total predicted genes within the metagenome encoded the putative RuBisCO large subunit, and a high sequence diversity of RuBisCO genes was identified for both the large (n=22) and small (n=21) subunit genes (Appendix 3.3). However, some of the large subunit gene fragments were too short to contain enough information to infer phylogeny accurately. Of the 20 RuBisCO large subunit sequences that could be placed reliably on the phylogenetic tree (Figure 3.6), two Robinson Ridge sequences clustered loosely with the “green type” RuBisCO, while the remaining RuBisCO sequences were affiliated with the “red type”, clustering with known Actinobacteria sequences of subtype IE as well as the novel RuBisCO large subunit gene sequences recently identified by Tebo et al. (2015). The putative RuBisCO genes identified in the metagenome were predominantly distributed within the Actinobacteria but, most interestingly, both the novel RuBisCO large and small subunit genes were also present within the WPS-2 (bin 22) and AD3 (bin 11 and 12) draft genomes (K01601 and K01602, Appendix 3.3). In addition to this key enzyme for the CBB cycle, carbonic anhydrase (EC 4.2.1.1), which is involved in inorganic carbon (CO2) uptake (K01603, Appendix 3.3), and the RuBisCO activating enzyme (cbbX, for the transcription of RuBisCO genes) were also identified in the candidate division WPS-2 and AD3 genomes (Figure 3.5).
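
The percentage reported above for the RuBisCO large subunit is a ratio of annotated genes to all predicted genes. A minimal Python sketch of that calculation is given here, assuming a hypothetical mapping from gene identifiers to KEGG Orthology IDs. The gene names, the total gene count and the function name `ko_relative_abundance` are illustrative only; they are not taken from the Robinson Ridge data or from the annotation pipeline used in this thesis.

```python
from collections import Counter


def ko_relative_abundance(gene_to_ko, total_genes):
    """Return {KO id: percentage of all predicted genes carrying that KO}.

    gene_to_ko  : dict mapping gene id -> KO id (None if unannotated)
    total_genes : total number of predicted genes in the metagenome
    """
    if total_genes <= 0:
        raise ValueError("total_genes must be positive")
    # Count only genes that received a KO assignment.
    counts = Counter(ko for ko in gene_to_ko.values() if ko)
    return {ko: 100.0 * n / total_genes for ko, n in counts.items()}


# Hypothetical toy annotation (not real data).
annotations = {
    "gene_0001": "K01601",  # RuBisCO large subunit
    "gene_0002": "K01602",  # RuBisCO small subunit
    "gene_0003": "K01601",
    "gene_0004": None,      # unannotated gene
}
abundance = ko_relative_abundance(annotations, total_genes=10_000)
print(abundance)
```

Dividing by the total number of predicted genes, rather than by the number of annotated genes, keeps the values comparable between metagenomes with different annotation rates.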


Table 3.4. Potential secondary metabolite biosynthesis gene clusters identified by antiSMASH. Taxonomic origins within parentheses were inferred from the most similar known clusters. NRPS: nonribosomal peptide synthetase.

| Origin | Type | Most similar known cluster |
|---|---|---|
| Bin 15 (Proteobacteria) | Bacteriocin | Oxazolomycin_biosynthetic_gene_cluster (6% of genes show similarity) |
| Bin 15 (Proteobacteria) | Bacteriocin | - |
| Bin 10 (Actinobacteria) | Bacteriocin | - |
| (Proteobacteria) | Butyrolactone | - |
| (Actinobacteria) | Lantipeptide | Catenulipeptin_biosynthetic_gene_cluster (60% of genes show similarity) |
| (Actinobacteria) | Lantipeptide | - |
| (Actinobacteria) | NRPS | - |
| (Actinobacteria) | NRPS | Mannopeptimycin_biosynthetic_gene_cluster (7% of genes show similarity) |
| (Actinobacteria) | NRPS | - |
| (Actinobacteria) | Other polyketide synthase | Eicosapentaenoic_acid_biosynthetic_gene_cluster (22% of genes show similarity) |
| (Actinobacteria) | Other polyketide synthase | - |
| (Actinobacteria) | Other polyketide synthase | - |
| (Actinobacteria) | | Asukamycin_biosynthetic_gene_cluster (4% of genes show similarity) |
| (Actinobacteria) | Siderophore | - |
| (Actinobacteria) | T2 polyketide synthase | BE-7585A_biosynthetic_gene_cluster (23% of genes show similarity) |
| (Actinobacteria) | T3 polyketide synthase | Alkyl-O-Dihydrogeranyl-Methoxyhydroquinones_biosynthetic_gene (71% of genes show similarity) |
| Bin 13 (AD3) | Terpene | - |
| Bin 9 (Actinobacteria) | Terpene | - |
| Bin 3 (Actinobacteria) | Terpene | Hopene_biosynthetic_gene_cluster (46% of genes show similarity) |
| Bin 3 (Actinobacteria) | Terpene | Isorenieratene_biosynthetic_gene_cluster (28% of genes show similarity) |
| Bin 15 (Proteobacteria) | Terpene | - |
| (Actinobacteria) | Terpene | - |
| (Actinobacteria) | Terpene | Geosmin_biosynthetic_gene_cluster (100% of genes show similarity) |

Figure 3.5 The gene arrangement structures of the RuBisCO gene clusters predicted in draft genomes. (A) Bacterial Candidate Division WPS-2 draft genome bin 22 and (B) bacterial Candidate Division AD3 draft genome bin 12. The assembled contigs in which RuBisCO large and small subunit genes were identified are shown. cbbX: RuBisCO expression protein; rbcS: RuBisCO small subunit; rbcL: RuBisCO large subunit; cbbP: phosphoribulokinase protein; cbbT: transketolase; cbbR: transcriptional regulator protein.

Figure 3.6 Phylogenetic structure of type IE RuBisCO large subunit amino acid sequences identified from Robinson Ridge. Nodes labelled with ● denote a bootstrap value greater than 0.75 after 1000 replications, and type II and III RuBisCO sequences were included as the outgroup. Robinson Ridge RuBisCO large subunit gene sequences are identified with “RR” and binned sequences are coloured red; “Novel” RuBisCO sequences from Tebo et al. are coloured blue and identified with “AID”; selected sequences from Park et al. and Actinobacteria were included as references.

3.3.3.4 Nitrogen metabolism

Compared with other ecosystems, Robinson Ridge had a significantly lower relative abundance (t-test, p < 0.05) of all nitrogen cycling related genes (Appendix 3.3). Complete pathways for denitrification, assimilatory and dissimilatory nitrate reduction were identified within the Robinson Ridge metagenome, but the nitrification pathway was incomplete and the nitrogen fixation pathway was not detected (Appendix 3.6). Putative assimilatory nitrate reduction was identified in draft genomes of AD3 (bin 12), Actinobacteria (bin 6 and 11) and Verrucomicrobia (bin 20) (K00366, K00367 and K00372, Appendix 3.3), while dissimilatory nitrate reduction was predominantly identified in Actinobacteria (bin 6, 7, 8 and 11), Proteobacteria (bin 15) and Verrucomicrobia (bin 20) (K00362, Appendix 3.3). In contrast, the denitrification pathway was more abundant and was present in both bacteria and archaea: WPS-2 (bin 22), Thaumarchaeota (bin 19), Proteobacteria (bin 15), Actinobacteria (bin 1, 5, 6, 7, 8 and 11) and Verrucomicrobia (bin 20 and 21) (K00368, K00370 and K00376, Appendix 3.3). Only archaeal amoC genes were identified from the metagenome, and these belonged to Thaumarchaeota (bin 19) (K10944 and K10945, Appendix 3.3).

3.3.3.5 Energy metabolism

Putative cytochrome c type oxidases were identified in all 23 draft genomes, suggesting that all bins recovered in this study can tolerate oxygen and are likely to have aerobic respiration capacities. Genes for both putative soluble and particulate methane monooxygenase (mmo and pmo, respectively) were identified within the metagenome. While pmo was also identified in Proteobacteria (including bin 15), mmo was of Actinobacteria origin only (Appendix 3.7).

Despite the wide distribution of genes for the CBB cycle for carbon fixation, no chlorophyll or proteorhodopsin related structural genes for photosynthesis were identified in the entire metagenome. Instead, putative genes encoding carbon monoxide dehydrogenase (coxSML) and type 1h/5 hydrogenase were widely identified (Appendix 3.3, Figure 3.7). This suggests these bacteria utilise alternative energy sources (such as CO or H2) for ATP and NADPH recycling. The enzyme products of coxSML and type 1h/5 [NiFe]-hydrogenases have the capacity to provide alternative energy sources for the CBB pathway, and all of the bacterial draft genomes that are potentially capable of carbon fixation, apart from bin 6 (Actinobacteria), contained one or both of these genes (Appendix 3.3). Unfortunately, we did not recover any complete uptake hydrogenase gene sequences, but the translated protein sequences of the fragments obtained did cluster within reference type 1h/5 [NiFe]-hydrogenases (Figure 3.7), including those identified within the candidate divisions WPS-2 and AD3 (bin 22, and bins 12 and 13, respectively). Finally, aerobic carbon monoxide dehydrogenase requires molybdenum as a cofactor; ABC transporters specific for molybdenum were identified in all putative carbon monoxide dehydrogenase containing draft genomes, confirming their potential role in lithotrophic energy metabolism.

Figure 3.7 Maximum likelihood phylogenetic tree showing the type 1h taxonomy classification of Robinson Ridge [NiFe]-hydrogenase large subunit sequences. Nodes with bootstrap values greater than 0.75 after 1000 replications are denoted by ●; reference sequences are coloured blue, while binned sequences are coloured red. The majority of hydrogenases identified from Robinson Ridge were high affinity type 1h/5 [NiFe]-hydrogenases, capable of oxidising tropospheric H2.


3.4 Discussion

While no nitrogen fixation related genes were identified in the metagenome, Cyanobacteria 16S rRNA genes were identified. As metabolic potentials and pathways were inferred from the metagenomic data, the lack of nitrogen fixation related genes, and possibly of other metabolic pathways, may be due to insufficient sequencing depth. The sequencing depth of the 23 recovered genome bins varied from 11-60x (Table 3.4), therefore the sequencing depth for bacteria at lower abundances can be expected to be even lower. Those low coverage reads may have failed to be assembled into longer contigs and therefore were not identified here. However, due to their low abundance, these metabolic pathways are expected to play only a minor role in the ecosystem.

The CBB cycle was the only complete carbon fixation pathway identified in the Robinson Ridge metagenome. Within the CBB cycle, RuBisCO fixes CO2 via ribulose 1,5-bisphosphate to produce 3-phosphoglycerate, and the presence of the gene is commonly used as an indicator of carbon fixation capacity (Giri et al., 2004). The RuBisCO genes identified in Robinson Ridge were predominantly type IE (Figure 3.6), and clustered with the RuBisCO identified in ice caves of Mt Erebus (Tebo et al., 2015). Therefore, the “novel” type I RuBisCO genes reported by Tebo et al. were actually of the type IE form.

Genes required for the CBB cycle are commonly clustered, and the arrangement of the gene cluster is indicative of the RuBisCO type (red vs green) (Tabita et al., 2008). The cbbX protein is a specific red-type RuBisCO activase (Mueller-Cajar et al., 2011), which plays a major role in regulating RuBisCO large subunit activity. The putative cbbX gene was identified within the RuBisCO gene cluster of candidate divisions WPS-2 and AD3, positioned downstream of the rbcL and rbcS genes. Such a gene construct was also observed in “red type” chemolithoautotrophic bacteria such as Bradyrhizobium japonicum (Badger et al., 2008), Sulfobacillus spp. (Caldwell et al., 2007) and Pseudonocardia dioxanivorans CB1190 (Grostern et al., 2013). Therefore, bacteria in Robinson Ridge that carry the red type RuBisCO are highly likely to be chemolithoautotrophs.


[NiFe]-hydrogenases are involved in H2 oxidation and evolution and are widely identified in bacteria and archaea (Shafaat et al., 2013, Greening et al., 2015a). They are classified into five distinct lineages, and type 1h/5 [NiFe]-hydrogenase is the most dominant hydrogenase in aerated samples, accounting for 0.004 to 0.009% of the total metagenome in agricultural and forest soils (Greening et al., 2015a). Type 1h/5 hydrogenase is oxygen-tolerant and has a high affinity for hydrogen, with the ability to oxidise tropospheric H2 (Greening et al., 2014a, Liot et al., 2015). It is also involved in supplying energy to maintain bacterial dormancy (Greening et al., 2014a, Liot et al., 2015).

Type 1h/5 [NiFe]-hydrogenases were the most dominant hydrogenases identified in Robinson Ridge. Due to the extreme environmental conditions and limited nutrient availability in Antarctica, soil bacteria experience extended periods of dormancy (Greening et al., 2015b, Kudinova et al., 2015); therefore, this [NiFe]-hydrogenase potentially also plays an important role in maintaining cell viability in Antarctica during the polar winter. In addition, the bacteria carrying RuBisCO genes were predicted to use non-photosynthetic carbon fixation pathways, and uptake type [NiFe]-hydrogenases have been shown to provide energy for carbon fixation (Kuhns et al., 2016). Therefore, the type 1h/5 [NiFe]-hydrogenases identified in Robinson Ridge may also be involved in carbon fixation within the extremely carbon limited Antarctic environment.

Ammonia oxidation plays a vital role in the nitrogen cycle, and the ammonia oxidation efficiency of both ammonia-oxidising archaea and ammonia-oxidising bacteria is influenced by both soil pH (Zhang et al., 2012) and soil ammonia concentrations (Verhamme et al., 2011). Archaea are reported to be the primary ammonia oxidisers under low pH and low ammonia conditions, while bacteria dominate both neutral and alkaline soils (Verhamme et al., 2011, Zhang et al., 2012). Only the archaeal ammonia monooxygenase gene was detected in the Robinson Ridge metagenome; combined with a low soil pH of 5.5 and extremely low ammonia concentrations (7.8 mg/kg dry mass), the major role of archaea in ammonia oxidation is not surprising. While the hydroxylamine dehydrogenase (Hao) gene was missing from the entire metagenome and the Crenarchaeota draft genome (bin 19), evidence suggests Crenarchaeota use alternative enzymes to those of the hydroxylamine dehydrogenase pathway to complete ammonia oxidation (Stahl et al., 2012). Thus, at Robinson Ridge Crenarchaeota play a major role in nitrogen cycling, providing the capacity for nitrification and denitrification in this ecosystem.

Secondary metabolites are compounds that are not essential for a microorganism’s growth, development and reproduction, but frequently provide a competitive advantage for survival (Blin et al., 2013). The terpene biosynthesis pathway was the most abundant secondary metabolite biosynthesis pathway observed within the Robinson Ridge metagenome. Terpenes, also known as terpenoids or isoprenoids, are a large group of compounds whose diverse functions include antibacterial and antifungal activity, toxicity to various eukaryotic organisms and action as messenger molecules (Gershenzon et al., 2007, Yamada et al., 2015). Knowledge of the function of the three predicted terpene products discovered here (hopene, geosmin and isorenieratene, Table 3.4) is limited, with reports of hopene as a steroid, geosmin as a semi-volatile compound and isorenieratene as a glycosylated carotenoid orange pigment; all have been reported in Actinobacterial genomes previously (Alloisio et al., 2005, Klausen et al., 2005, Guttman et al., 2008, Maresca et al., 2008). A range of potential gene clusters for the production of secondary metabolites with antibacterial properties was also identified, such as oxazolomycin (Mori et al., 1985), mannopeptimycin (He et al., 2004), asukamycin (Omura et al., 1976) and BE-7585A (Sasaki et al., 2010). The abundance of secondary metabolite biosynthesis genes identified here suggests Robinson Ridge is a potential target for natural bioactive compound prospecting.

Within the candidate phyla WPS-2 and AD3 genomes recovered, unique metabolic functional potentials were identified, including carbon fixation and denitrification capacities. Obtaining genomes from these uncharacterised phyla permits a thorough investigation into their genomic capacities and inference of their ecological roles and survival strategies in the harsh Antarctic environment.


Chapter 4 Characterisation of candidate phylum WPS-2

All work presented here was completed by the candidate.

4.1 Introduction

In chapters 2 and 3, candidate phyla WPS-2 and AD3 were found to be present in Mitchell Peninsula and Robinson Ridge in unprecedentedly high abundance, allowing the reconstruction of the first two draft genomes from WPS-2 and three from AD3. Due to time constraints, this chapter focuses on the characterisation of candidate phylum WPS-2 only.

WPS-2 was first identified from soil contaminated with polychlorinated biphenyl, with the designation WPS meaning “Wittenberg polluted soil” (Nogales et al., 2001). While WPS-2 specific PCR primers have been previously designed (Costello 2007, Camanocha et al., 2014), no WPS-2 specific Fluorescence In situ Hybridisation (FISH) probe exists. FISH has been used widely to identify individual microbial cells within their natural environment (Amann et al., 2001, Muggia et al., 2013, Schmidt et al., 2014), and to investigate bacterial cell morphology (Foreman et al., 2007) and community structures in various environments including soils (Yakimov et al., 2004). In addition, FISH combined with isotopically labelled organic or inorganic molecules has been used to visualise bacterial populations that are able to take up a labelled substrate (Lee et al., 1999), while FISH-positive cells can be sorted using flow cytometry, a strategy used for downstream single cell genomics and enrichment (Amann et al., 1990).

The first aim of this chapter was to characterise candidate phylum WPS-2 by analysing the functional capacities of the two draft genomes recovered in chapter 3. Secondly, I aimed to design and validate a WPS-2 specific FISH probe to allow visualisation of the morphology of WPS-2 cells extracted from Windmill Islands, Antarctic soils, which have been shown to harbour a high abundance of this uncultured phylum.

4.2 Materials and methods

4.2.1 Genome analysis

WPS-2 draft genomes were retrieved from our shotgun metagenomics project (JGI Genome id: bin22, 2667527204; bin23, 2667527203, Chapter 4) and open reading frames were predicted and annotated using PROKKA (Seemann 2014). Genome annotation was performed using JGI (Nordberg et al., 2014), KEGG IDs were annotated using the KEGG Automatic Annotation Server (Moriya et al., 2007), while carbohydrate-related catalytic domains were annotated using the Carbohydrate-Active enZYmes database (Cantarel et al., 2009). Gene families associated with the cell envelope were identified using a set of Hidden Markov Models (Yeoh et al., 2016), with the inclusion of 11 reference genomes belonging to Actinobacteria, Armatimonadetes, Firmicutes, Proteobacteria, Thaumarchaeota and Verrucomicrobia.

4.2.2 Probe design and optimisation

A WPS-2 specific FISH probe was designed using the ARB probe design function based on the alignment of full length WPS-2 SSU gene sequences (Ludwig et al., 2004). Additionally, the site of hybridisation was carefully selected based on SSU RNA accessibility (Fuchs et al., 1998). The specificity of the WPS-2 probe was tested in silico using TestProbe 3.0 against the non-redundant SILVA Reference dataset (Quast et al., 2013).
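
The principle behind an in silico specificity test can be illustrated with a short Python sketch. This is not the TestProbe implementation and not the WPS2-289 probe: the probe sequence, the two SSU fragments and the function names below are hypothetical and serve only to show the idea. A FISH probe is complementary to the rRNA, so its reverse complement is searched for in each SSU gene sequence, and a sequence counts as a hit when the best-matching site has no more than an allowed number of mismatches.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def min_mismatches(probe, ssu_gene):
    """Fewest mismatches between the probe target site and any
    equally long window of the SSU gene (coding strand, DNA or RNA)."""
    target = reverse_complement(probe)
    gene = ssu_gene.upper().replace("U", "T")
    k = len(target)
    if len(gene) < k:
        return None  # sequence too short to contain the target site
    return min(
        sum(a != b for a, b in zip(target, gene[i:i + k]))
        for i in range(len(gene) - k + 1)
    )


def probe_hits(probe, sequences, max_mismatches=0):
    """Names of sequences matched by the probe within the mismatch limit."""
    hits = []
    for name, seq in sequences.items():
        mm = min_mismatches(probe, seq)
        if mm is not None and mm <= max_mismatches:
            hits.append(name)
    return hits


# Hypothetical 8-mer probe and toy SSU fragments (not real data).
probe = "TTGACCGA"
sequences = {
    "WPS2_like": "AAGGTCGGTCAACC",   # contains the exact target site
    "non_target": "AAGGTCGCTCTACC",  # best site has two mismatches
}
perfect = probe_hits(probe, sequences, max_mismatches=0)
relaxed = probe_hits(probe, sequences, max_mismatches=2)
print(perfect, relaxed)
```

A probe is considered specific when perfect matches are restricted to the target group and non-target sequences only match once several mismatches are allowed.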

4.2.3 Construction of a WPS-2 specific clone

Due to the lack of WPS-2 positive controls, the WPS2-289 FISH probe was optimised and validated using clone-FISH (Schramm et al., 2002). Bacterial SSU RNA genes were first amplified from Antarctic soil DNA using 27F/1492R (Table 4.1). A clone library was constructed using a Promega TA cloning kit (Promega, Australia) following the manufacturer’s instructions. Clones were Sanger sequenced (Ramaciotti, Australia) and WPS-2 positive clones were identified based on sequence similarity to the WPS-2 reference sequence AJ292684 (the plasmid was named pGEMT-WPS2SSU).


For gene expression, the WPS-2 SSU gene was further cloned into an expression vector (pET24a) using a ligation-independent method. Clones carrying the confirmed pGEMT-WPS2SSU plasmid were grown in LB broth supplemented with ampicillin (100 µg/ml) at 37 °C overnight. The pGEMT-WPS2SSU plasmids were then extracted using the Biorise plasmid purification kit (Biorise, Australia) and the WPS-2 SSU gene was further cloned into a pET24a plasmid (DNA2.0, USA) using the NEBuilder cloning kit (NEB, Australia) (the plasmid was named pET24a-WPS2SSU) (Figure 4.1). In brief, the correct WPS-2 SSU RNA gene sequence was amplified using modified 27F/1492R primers with linker adapters for the pET24a plasmid, and the pET24a plasmid was linearised using primers pET24a-linF/pET24a-linR (Table 4.1). The amplified WPS-2 SSU RNA gene was cloned into the linearised pET24a plasmid using the NEBuilder cloning kit (NEB, Australia) following the manufacturer’s instructions. The pET24a-WPS2SSU plasmid was transformed into competent E. coli BL21 (DE3) cells prepared in-house (Sambrook et al., 2001).


Figure 4.1 The structure of the pET24a-WPS2SSU plasmid. The WPS-2 SSU RNA gene is positioned downstream of a T7 promoter and a Lac operator (regions coloured red) within a pET24a plasmid (A). A detailed plasmid map shows the orientation of the WPS-2 SSU RNA gene insert (direction of arrow), the position of primers, the probe binding site and the site of mutagenesis (B). RBS: ribosomal binding site.


Table 4.1 PCR primers and probes used in this study.

| Primer name | Sequence | Purpose | Reference |
|---|---|---|---|
| 27F | 5’-AGAGTTTGATCMTGGCTCAG-3’ | Universal PCR primer | (Lane 1991) |
| 1492R | 5’-TACGGYTACCTTGTTACGACTT-3’ | Universal PCR primer | (Lane 1991) |
| 27F-pET24 | 5’-GGTCGCGGATCCGAAAGAGTTTGATCMTGGCTCAG-3’ | Universal PCR primer | This study |
| 1492R-pET24 | 5’-TCGACGGAGCTCGAATACGGYTACCTTGTTACGACTT-3’ | Universal PCR primer | This study |
| pET24a-linF | 5’-TTCGGATCCGCGACCCAT-3’ | pET24a linearisation | This study |
| pET24a-linR | 5’-TTCGAGCTCCGTCGACAAGC-3’ | pET24a linearisation | This study |
| WPS305G->AF | 5’-CAAACCAGCTACCCATCGGAGTCTTGG-3’ | Introduction of a single point mutation | This study |
| WPS305G->AR | 5’-AGAGAACGACCAGCCACACTGGGACTG-3’ | Introduction of a single point mutation | This study |
| Eub-338i-FAM | 5’-GCTGCCTCCCGTAGGAGT[FAM]-3’ | Universal bacterial FISH probe | (Amann et al., 1990) |
| WPS2-289-Cy3 | 5’-TCGCTCTCTCAAACCAGC[CY3]-3’ | WPS-2 specific FISH probe | This study |
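The relationship between the linker adapters and the linearisation primers in Table 4.1 can be checked directly from the sequences. The sketch below assumes that the 5' extension on each modified primer is meant to be the reverse complement of the 5' end of the matching pET24a linearisation primer, which is what an overlap-based assembly would require. The function names are illustrative and do not belong to any kit or library.

```python
# Complement table including the IUPAC codes that occur in the primers (M, Y)
COMPLEMENT = str.maketrans("ACGTMKRYSWN", "TGCAKMYRSWN")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def linker_tail(modified: str, universal: str) -> str:
    """Return the 5' extension that precedes the universal primer sequence."""
    if not modified.endswith(universal):
        raise ValueError("modified primer does not end with the universal primer")
    return modified[: len(modified) - len(universal)]

# Sequences copied from Table 4.1 (5'-3')
PRIMERS = {
    "27F": "AGAGTTTGATCMTGGCTCAG",
    "1492R": "TACGGYTACCTTGTTACGACTT",
    "27F-pET24": "GGTCGCGGATCCGAAAGAGTTTGATCMTGGCTCAG",
    "1492R-pET24": "TCGACGGAGCTCGAATACGGYTACCTTGTTACGACTT",
    "pET24a-linF": "TTCGGATCCGCGACCCAT",
    "pET24a-linR": "TTCGAGCTCCGTCGACAAGC",
}

tail_f = linker_tail(PRIMERS["27F-pET24"], PRIMERS["27F"])
tail_r = linker_tail(PRIMERS["1492R-pET24"], PRIMERS["1492R"])

# Each tail should pair with the 5' end of one linearisation primer
overlap_f = PRIMERS["pET24a-linF"].startswith(reverse_complement(tail_f))
overlap_r = PRIMERS["pET24a-linR"].startswith(reverse_complement(tail_r))

print(tail_f, overlap_f)
print(tail_r, overlap_r)
```

Both tails are 15 nt long and pair with the first 15 nt of pET24a-linF and pET24a-linR respectively, so the amplified insert and the linearised vector share the terminal overlaps needed for assembly.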


4.2.4 Clone-FISH

Clone-FISH was performed as described (Schramm et al., 2002). Briefly, E. coli BL21(DE3) cells containing the pET24a-WPS2SSU plasmid were cultured in LB containing ampicillin (100 µg/ml) at 37 °C with shaking at 200 rpm until an OD600 of 0.3-0.4 was reached. Expression was induced by adding IPTG (1 mM) with continued shaking for 1 h; the cellular RNA level was then further enriched by adding chloramphenicol (170 mg/l) and incubating for 4 h with shaking. Cells were harvested by centrifugation at 4000 g for 10 min at 4 °C, then fixed with 4% paraformaldehyde (PFA) overnight at 4 °C.

Standard FISH was performed as described (Fuchs et al., 1998). In brief, induced and non-induced E. coli cells were fixed on a glass slide, air-dried and dehydrated through sequential washing with 50, 80 and 100% ethanol. Slides were hybridised in hybridisation buffer containing 0.9 M NaCl, 20 mM Tris-HCl (pH 7.6), 0.01% sodium dodecyl sulfate (SDS) and 1 µM each of the generic bacterial probe Eub-338i-FAM (Fuchs et al., 1998) and WPS2-289-Cy3 at 46 °C for 2 h in a 50 ml polyethylene tube. Slides were washed twice with 50 ml of washing buffer containing 0.9 M NaCl, 20 mM Tris-HCl and 0.01% SDS in a 50 ml polyethylene tube at 48 °C for 15 min.

4.2.5 WPS-2 specific Clone-FISH optimisation

To test WPS2-289 probe specificity, sequences in which the SSU gene varied by one nucleotide from the target WPS2-289 sequence were identified in silico using TestProbe 3.0 (Quast et al., 2013). The most dominant sequence carrying a one-bp mismatch was identified and a mutant clone was constructed using the Q5® Site-Directed Mutagenesis Kit (NEB) with the primer pair WPS305G->AF and WPS305G->AR (Table 4.1) (the mutant plasmid was named pET24a-WPS2SSUm). Cloning, transformation and FISH were performed as described above, except that hybridisation buffer stringency was optimised through the addition of 0, 10, 20 or 30% formamide. The washing buffer composition also varied according to the formamide concentration used (i.e., 10% formamide = 0.45 M NaCl; 20% formamide = 0.225 M NaCl, 5 mM EDTA; 30% formamide = 0.112 M NaCl, 5 mM EDTA).
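The pairing of formamide concentration and wash-buffer composition can be kept as a small lookup, sketched here in Python. The 0% row is taken from the standard wash in section 4.2.4, and the 30% value is read as 0.112 M NaCl, in line with the halving pattern of the series. The 5 M NaCl stock and the helper names are assumptions made for illustration only.

```python
# Formamide (% v/v) in the hybridisation buffer -> (NaCl in M, EDTA in mM)
# in the matching wash buffer. Values follow the text; see the lead-in for
# the assumptions behind the 0% and 30% rows.
WASH_BUFFER = {
    0: (0.9, 0.0),
    10: (0.45, 0.0),
    20: (0.225, 5.0),
    30: (0.112, 5.0),
}

def stock_volume_ml(final_molar: float,
                    final_volume_ml: float = 50.0,
                    stock_molar: float = 5.0) -> float:
    """Volume of NaCl stock needed, from C1 * V1 = C2 * V2.

    The 50 ml wash volume is from the protocol; the 5 M stock is hypothetical.
    """
    return final_molar * final_volume_ml / stock_molar

def wash_recipe(formamide_percent: int) -> dict:
    """Return the wash-buffer composition for a tested formamide level."""
    if formamide_percent not in WASH_BUFFER:
        raise KeyError(f"no wash buffer defined for {formamide_percent}% formamide")
    nacl_m, edta_mm = WASH_BUFFER[formamide_percent]
    return {
        "NaCl_M": nacl_m,
        "EDTA_mM": edta_mm,
        "NaCl_stock_ml": round(stock_volume_ml(nacl_m), 2),
    }

for percent in sorted(WASH_BUFFER):
    print(percent, wash_recipe(percent))
```

Lowering the salt in the wash as formamide rises keeps the wash stringency in step with the hybridisation stringency, which is why the two are varied together.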

4.2.6 Epi-fluorescence microscopy

Hybridised cells were visualised on a BX61 motorised microscope with an LED light source and a DP71 digital camera attachment (Olympus). The FAM fluorophore was excited at 460-495 nm and detected at an emission wavelength of 510-550 nm, while the Cy3 fluorophore was excited at 528-550 nm and detected at 575-630 nm.

4.2.7 Extraction of cells from soil using Nycodenz

A portion (0.25 g) of Mitchell Peninsula soil containing a high abundance of WPS-2 (>10%) was suspended in 0.02 µm filtered 0.9% NaCl (2.5 ml) and vortexed at high speed for 30 min. Cells were left to settle for 5 min, then the supernatant was collected and divided into 2 portions, which were placed on top of 1 ml Nycodenz solution (Axis-Shield, Norway) (density 1.3 g/ml) and centrifuged at 17,000 g for 30 min at 4 °C, after which two liquid fractions and an interface were observed. Both surface fractions as well as the interfaces were collected, pooled, then diluted with sterile 0.9% NaCl (10 ml) and centrifuged at 17,000 g for 30 min at 4 °C. The supernatant was removed and the pellet containing bacterial cells was collected. The cells were then resuspended in sterile 0.9% NaCl (50 µl) and placed on a 0.2 µm polycarbonate membrane (PCM, Millipore) using a filtration manifold (Carbon 14, Denmark). Cellular rRNA content was enriched by placing the membrane on a 0.05x tryptic soy gellan gum plate for 6 h at 4 °C, then the cells were fixed with 4% PFA (Sigma-Aldrich) at 4 °C overnight. The membrane was then washed twice with sterile 0.9% NaCl (300 µl) and FISH was performed as described above.

4.3 Results

4.3.1 Genomic analysis of WPS-2

The two draft WPS-2 genomes (bin22 and bin23) contained 2.6 and 1.9 megabases (Mb), with 2,528 and 1,987 open reading frames predicted for bin22 and bin23, respectively (Table 4.2). Both draft genomes were near-complete, with completeness estimated to be 99.1 and 99.4%. Bin22 was larger than bin23 in genome size, but had similar GC content and coverage (Table 4.2). SSU RNA gene fragments were identified in both draft genomes, but 5S and 23S rRNA genes were not identified. In addition, a higher number of tRNA genes was identified in bin22 compared with bin23. A higher number of genes was identified from bin22, but a similar proportion of the predicted genes was assigned to a function and KEGG orthology category. The predicted proteins showed very low similarity to genes from known bacterial phyla, with only 12.6 and 12.2% of the predicted proteins exhibiting >=60% sequence identity to proteins from other sequenced organisms (Appendix 4.1). Predicted proteins with >=60% sequence homology were most similar to proteins identified from Proteobacteria, Firmicutes, Chloroflexi, Actinobacteria and Acidobacteria.

Both draft genomes demonstrated limited polysaccharide hydrolysis capacity (Appendix 4.2), with only one alpha-glucoamylase predicted from both draft genomes and one additional alpha-amylase within bin22. Both genomes are predicted to degrade amino saccharides such as acetylgalactosamine and hexosamines, as well as mannose-derived saccharides. An S-layer homology domain that anchors the cellulosome onto the bacterial cell surface was identified in bin23, while mannosyl glucosephosphorylase and beta-glucansidase, which are potentially involved in degrading plant-derived materials, were identified from both draft genomes.

The glucose produced by these putative hydrolases could be used in glycolysis. Within the glycolysis pathway, only the first stage (from glucose-6P to glycerate-3P) was completely predicted in both draft genomes, but the conversion from glycerate-3P to phosphoenolpyruvate was incomplete (Figure 4.2). While bin22 lacked the capacity to convert glyceraldehyde-3P to glycerate-2P, bin23 lacked the enzyme to convert glycerate-2P to phosphoenolpyruvate (Figure 4.2). Both draft genomes were capable of converting phosphoenolpyruvate into pyruvate and pyruvate into acetyl-CoA, which could be used as a substrate for the TCA cycle. Within the TCA cycle, the complete gene set was identified in bin22, while bin23 lacked fumarate hydratase for the conversion of fumarate to malate (Figure 4.2).
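The kind of pathway gap reported for the TCA cycle can be expressed as a simple set comparison. The Python sketch below is illustrative only: the enzyme sets for bin22 and bin23 are placeholders built from the statement in the text (complete in bin22, fumarate hydratase missing in bin23), not the actual annotation output, and the function name is hypothetical.

```python
# Ordered TCA cycle steps: (enzyme, reaction catalysed)
TCA_STEPS = [
    ("citrate synthase", "oxaloacetate + acetyl-CoA -> citrate"),
    ("aconitate hydratase", "citrate -> isocitrate"),
    ("isocitrate dehydrogenase", "isocitrate -> 2-oxoglutarate"),
    ("2-oxoglutarate dehydrogenase", "2-oxoglutarate -> succinyl-CoA"),
    ("succinyl-CoA synthetase", "succinyl-CoA -> succinate"),
    ("succinate dehydrogenase", "succinate -> fumarate"),
    ("fumarate hydratase", "fumarate -> malate"),
    ("malate dehydrogenase", "malate -> oxaloacetate"),
]

def missing_steps(annotated: set) -> list:
    """Return the (enzyme, reaction) pairs that have no annotated gene."""
    return [(enzyme, reaction) for enzyme, reaction in TCA_STEPS
            if enzyme not in annotated]

all_enzymes = {enzyme for enzyme, _ in TCA_STEPS}

# Placeholder annotations reflecting the result described in the text
annotations = {
    "bin22": set(all_enzymes),
    "bin23": all_enzymes - {"fumarate hydratase"},
}

for genome, enzymes in annotations.items():
    gaps = missing_steps(enzymes)
    status = "complete" if not gaps else "incomplete"
    print(genome, status, gaps)
```

In a draft genome assembled from metagenomic data, a single missing step of this kind may reflect an assembly gap rather than a true absence, so such a report is best read as a prediction.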


Table 4.2 Genome statistics of WPS-2.

| Population genome ID | Bin22 | Bin23 |
|---|---|---|
| Completeness (%) | 99.1 | 99.4 |
| Genome size (Mb) | 2.6 | 1.9 |
| Contigs (#) | 19 | 43 |
| GC (%) | 62.6 | 60.1 |
| Coverage | 25.4x | 26.9x |
| ORF | 2529 | 1988 |
| % genes with a predicted function | 72.9 | 72.6 |
| % KEGG orthology mapped | 45.8 | 49.6 |
| No. tRNA identified | 48 | 36 |
| SSU RNA gene | 306 bp | 439 bp |
| 5S rRNA gene | Not identified | Not identified |
| 23S rRNA gene | Not identified | Not identified |

Figure 4.2 Overview of WPS-2 functional potential inferred from two draft genomes. The pathways and enzymes identified from the two draft genomes are colour-coded (bin22●, bin23○). The major difference between the two draft genomes was the presence of a putative Calvin-Benson cycle and high affinity [NiFe]-hydrogenases, highlighting the potential for dark fixation using atmospheric gases. A range of cell secretion pathways was identified in bin22. TCA cycle: tricarboxylic acid cycle.

The majority of electron transport chain proteins were predicted from both draft genomes. However, the two draft genomes contained different terminal oxidases (cbb3-type for bin22 and bd-type quinol oxidase for bin23) (Figure 4.2). Both draft genomes were capable of de novo synthesis of nicotinamide adenine dinucleotide (NAD), nicotinamide adenine dinucleotide phosphate (NADP), CoA and flavin adenine dinucleotide (FAD), but the ubiquinone biosynthesis pathway was absent. Thus, it was unclear how the quinone pool was recycled.

Carbon fixation capacity was predicted in draft genome bin22 with the identification of the large and small subunits of ribulose-bisphosphate carboxylase (Figure 4.2). However, fructose-1,6-bisphosphatase and phosphoribulokinase, required for the inter-conversion of sedoheptulose-1,7-bisphosphate and sedoheptulose-7P and between ribose-5P and ribulose-1,5-bisphosphate, were not recovered (Figure 4.2). The carbon dioxide transporter carbonic anhydrase was present in both draft genomes, while a high affinity type 1h/5 [NiFe]-hydrogenase was identified in bin22.

A complete pentose phosphate pathway was predicted in both draft genomes, which could provide NADPH and 5-phosphoribosyl diphosphate for de novo nucleotide biosynthesis. Pathways for de novo purine and pyrimidine biosynthesis were predicted in both draft genomes, except for GTP and dGTP synthesis. Both draft genomes lacked IMP dehydrogenase, which catalyses the rate-limiting reaction of de novo GTP biosynthesis that converts inosine monophosphate to xanthosine monophosphate. Compared with nucleotide biosynthesis, both draft genomes were predicted to have limited amino acid de novo biosynthesis capacity, with only complete alanine, aspartate, glutamine, glycine, asparagine and cysteine biosynthesis pathways identified. However, the genomes encoded a wide range of peptidases, including carboxypeptidase, aminopeptidase, endopeptidase, prolyl tripeptidyl peptidase and oligopeptidases, that may assist amino acid acquisition, but aminotransferases required for amino acid degradation were not recovered.

The two WPS-2 draft genomes demonstrated limited nitrogen and sulphur metabolism capacity. Anaerobic respiration via sulphate reduction was not identified, but bin22 is predicted to have minimal sulphur assimilation capacity, with the capacity to convert sulphate to phosphoadenosine phosphosulfate. Nitrogen cycling capacity in WPS-2 was also limited, with no nitrogen assimilation genes predicted in either draft genome. The only nitrogen cycling-related genes identified were a nitrite reductase (NADH) small subunit in bin23 and a nitrite reductase (NO forming) in bin22.

Both draft genomes were capable of converting acetyl-CoA to malonyl-CoA, which is used as a substrate for fatty acid synthesis. The capacity to synthesise fatty acids up to 18C was predicted from both draft genomes. Terpenoid backbone biosynthesis was also predicted, with the capacity to synthesise dimethylallyl-PP and geranyl-PP, both of which can be used to synthesise terpenes, terpenoids, sterols and peptidoglycan. However, the specific terpenoids that could be synthesised were not identified from the draft genomes. Both WPS-2 draft genomes were capable of synthesising a range of murein (peptidoglycan) and lipoprotein, while the comparison of cell envelope biosynthesis related genes showed that WPS-2 was most likely a diderm bacterium most similar to Armatimonadetes (Appendix 4.3). Complete thiamine phosphate, pantothenate, lipoate and protoheme biosynthesis pathways were identified; in comparison, folate and VB12 biosynthesis pathways were completely absent in both draft genomes. While the biotin biosynthesis pathway was completely absent in bin22, bin23 lacked only one gene, encoding pimeloyl-[acyl-carrier protein] methyl ester esterase, within the biotin biosynthesis pathway.

ABC-type transporters for phosphate, osmoprotectants (glycine and betaine) and iron complexes were predicted from both draft genomes. Molybdate transporters were also predicted, but through different systems; bin22 appears to rely on ModABC, while bin23 relies on WtpABC. Complete ABC transporters for ribose were identified from both genomes, and genes encoding the multiple sugar transport system ATP-binding domain were also identified. However, the genes for the remaining oligosaccharide ABC transporters were not identified. Phospholipids, lipooligosaccharides and lipopolysaccharides were identified from both draft genomes, while only transporters for branched-chain amino acids and oligopeptides for amino acid and peptide acquisition were detected. Genome analysis predicted bin22 to contain a variety of bacterial secretion systems including Sec-SRP, twin arginine targeting and type IV pili. In comparison, bin23 contained only the Sec-SRP system. Additionally, genes encoding a flagellar base fragment were also identified in bin22.

4.3.2 In silico probe validation

The designed WPS-2 FISH probe (WPS2-289; Table 4.1) covered 65.6% of WPS-2 sequences deposited in the SILVA SSU Ref database. Matches from non-targeted taxa occurred for Rhizobiales Incertae Sedis and Methylophilales lineages, with 1-7 hits observed. Allowing one mismatch introduced an additional 2,469 non-specific matches. Among these one-mismatch sequences, 97% resulted from a single nucleotide variation from guanine (G) to adenine (A) at the 15th nucleotide position of the probe WPS2-289 (E. coli position 305).
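The effect of allowing one mismatch can be illustrated with a minimal Python sketch. It is not TestProbe itself: it only derives the perfect-match target site as the reverse complement of the WPS2-289 sequence in Table 4.1, introduces the G-to-A change at the 15th nucleotide counted from the 5' end of that target site (an assumed reading of the position numbering), and scans a toy sequence. The flanking bases in the toy sequence and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def count_mismatches(site: str, target: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(site) != len(target):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(site, target))

def find_sites(sequence: str, target: str, max_mismatches: int) -> list:
    """Return (start, mismatches) for each window within the mismatch limit."""
    hits = []
    for start in range(len(sequence) - len(target) + 1):
        window = sequence[start:start + len(target)]
        n = count_mismatches(window, target)
        if n <= max_mismatches:
            hits.append((start, n))
    return hits

PROBE = "TCGCTCTCTCAAACCAGC"        # WPS2-289, 5'-3' (Table 4.1)
TARGET = reverse_complement(PROBE)  # perfect-match site on the gene strand

# G->A substitution at the 15th nucleotide of the target site (index 14)
MUTANT = TARGET[:14] + "A" + TARGET[15:]

# Toy sequence with hypothetical flanks, holding one perfect and one mutant site
toy_gene = "AAAA" + TARGET + "CCCC" + MUTANT + "TTTT"

perfect_hits = find_sites(toy_gene, TARGET, max_mismatches=0)
relaxed_hits = find_sites(toy_gene, TARGET, max_mismatches=1)
print(TARGET, MUTANT, count_mismatches(MUTANT, TARGET))
print(perfect_hits, relaxed_hits)
```

With no mismatches allowed only the perfect site is reported, whereas a tolerance of one mismatch also reports the G-to-A variant, which is the situation the mutant clone pET24a-WPS2SSUm was built to test.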

4.3.3 WPS-2 specific probe design and optimisation using clone-FISH

Following FISH, fluorescence signals corresponding to both Eub-338i-FAM and WPS2-289-Cy3 were observed after induction of the cloned WPS-2 SSU gene (Figure 4.3C). As expected, Eub-338i-FAM signals were observed with or without induction (Figure 4.3B), while WPS2-289-Cy3 signals were not observed without induction (Figure 4.3A). In addition, E. coli morphological changes of increased cell size and elongation were observed after induction (Figure 4.3C). These morphological changes were a result of chloramphenicol treatment (Schramm et al., 2002).

The pET24a-WPS2SSUm plasmid carrying a point mutation at E. coli position 305 (G->A) was constructed and its sequence was confirmed by sequencing. To optimise the stringency/specificity of WPS2-289, clone-FISH on E. coli carrying pET24a-WPS2SSUm was performed using 0, 10, 20 and 30% formamide. While signals from the Eub-338i-FAM probe were detected, no fluorescence signal from WPS2-289 was observed at any of the formamide concentrations tested (Figure 4.3D). Therefore, hybridisation buffer containing 0% formamide was selected for WPS-2 FISH.


Figure 4.3 Clone-FISH for FISH optimisation using E. coli carrying the WPS-2 16S SSU gene plasmid. (A) No fluorescence signal corresponding to the WPS2-289 probe was detected without the induction of the WPS-2 16S SSU gene. (B) Host E. coli rRNA was detected by the Eub-338i-FAM probe (green). (C) After induction, the majority of host E. coli cells were positive for both probes (yellow). (D) Fluorescence corresponding to WPS2-289 was not detected in the E. coli host carrying pET24a-WPS2SSUm at any of the formamide concentrations tested; only fluorescence corresponding to E. coli rRNA detected by the generic Eub-338i-FAM probe (green) was observed. Scale bars represent 10 µm.

4.3.4 FISH on Antarctic soil bacteria

Following Nycodenz extraction, FISH on isolated bacteria resulted in cells being positive for both Eub-338i-FAM and WPS2-289-Cy3 (Figure 4.4). The observed fluorescence signal from Eub-338i-FAM was weaker than that from WPS2-289-Cy3, which can be attributed to the fluorophores used (Fuchs et al., 2001). As expected, a proportion of cells were co-hybridised with both probes, with WPS2-hybridised cells predominantly cocci in shape, with a diameter of 0.6-1.2 µm (measured from 10 cells).

Figure 4.4 FISH following hybridisation with Eub-338i-FAM and WPS2-289-Cy3. Bacterial cells were extracted from Antarctic soils and hybridised with Eub-338i-FAM (green) and WPS2-289-Cy3 (red). The images from both detection channels are combined, with co-localisation resulting in orange fluorescence. WPS-2 cells were observed to be small cocci, with diameters ranging between 0.6 and 1.2 µm.

4.4 Discussion

As discussed in chapter 3, carbon fixation capacity was predicted in bin22 based on the presence of a type IE RuBisCO and a high-affinity type 1h/5 [NiFe]-hydrogenase. Therefore, bin 22 is potentially capable of mixotrophic growth, while bin 23 is an obligate heterotroph. Both WPS-2 genomes recovered lack de novo amino acid biosynthesis pathways and contain a wide variety of peptidases (carboxypeptidase, aminopeptidase, endopeptidase, prolyl tripeptidyl peptidase and oligopeptidases). Therefore, WPS-2 is most likely to acquire amino acids externally for protein biosynthesis, which is similar to that predicted in CPR Peregrinibacteria (Anantharaman et al., 2016). Though capable of surviving in extreme environments, WPS-2 is hypothesised to have a close association with plants and mosses, as evidenced by the large number of genes encoding proteins that can utilise plant cell wall components (Duckworth et al., 1972, Johnson et al., 2013, Nakae et al., 2013).

For the first time, WPS-2 bacterial cells extracted from Antarctic soils were successfully visualised following FISH. However, the resolution of the image was poor and suffered from strong background fluorescence. The background fluorescence was most likely a result of non-specific binding of fluorescently labelled probes to the polycarbonate membrane and of residual autofluorescent soil particles (Zarda et al., 1997, Pernthaler et al., 2002). In addition, the RNA content in Antarctic bacteria is predicted to be much lower than that of bacteria in temperate environments due to arrested protein biosynthesis during dormancy (Hu et al., 1998, Barria et al., 2013). Therefore, more sensitive fluorescence detection methods, such as catalysed reporter deposition-fluorescence in situ hybridization (CARD-FISH), would enhance the hybridisation signal strength and improve the signal-to-noise ratio (Ferrari et al., 2006, Eickhorst et al., 2008).

CARD-FISH is a technique frequently used to investigate bacteria in soil and marine environments (Ferrari et al., 2006, Tujula et al., 2006). In CARD-FISH, the probe is labelled with horseradish peroxidase (HRP), and the fluorescence signal from tyramides is activated and accumulated by HRP (Pernthaler et al., 2002). In addition, Nycodenz extraction prior to FISH and the use of confocal laser scanning microscopy have been suggested to improve FISH quality on soil samples (Bertaux et al., 2007).

With the identification of a strong reliance of WPS-2 on surrounding bacteria for amino acid and co-factor acquisition, the bacteria associated with WPS-2 are yet to be identified. In addition, with such a high abundance in Mitchell Peninsula and Robinson Ridge, the ecological driver shaping WPS-2 abundance requires further investigation.

Chapter 5 The phylogeny and environmental drivers of WPS-2

All work presented here was completed by the candidate.

5.1 Introduction

The genomic capacity of candidate phylum WPS-2 isolated from Antarctic desert soils was revealed in chapter 4. However, the phylogeny and the environmental drivers that shape the distribution of WPS-2, particularly of members in high abundance in polar regions or globally, are yet to be examined.

Currently, inconsistencies have been observed in the classification of WPS-2 across the available 16S rRNA gene databases. RDP and Greengenes both name WPS-2 as WPS-2 (DeSantis et al., 2006, Cole et al., 2013), while the Silva database has designated it as WD272 (Quast et al., 2013). Furthermore, EzBioCloud classifies WPS-2 within the Cyanobacteria (Chun et al., 2007). WPS-2 16S rRNA gene sequences have been widely reported globally in both environmental and host-associated samples, such as the canine oral microbiome (Dewhirst et al., 2012) and the human oral microbiome (Camanocha et al., 2014). Yet, the majority of the WPS-2 sequences have been recovered from soils and in most cases are present in relatively low abundances of <2% (Kavamura et al., 2013, Serkebaeva et al., 2013).

To date, the phylogeny of WPS-2 remains unresolved. When the WPS-2 SSU RNA gene sequence was first identified, a single WPS-2 clone (WD272) was recovered and its phylogenetic position was thought to be in close proximity to Cyanobacteria and Deinococci (Nogales et al., 2001). In 2014, WPS-2 was proposed to be comprised of five lineages (sub-clusters) (Camanocha et al., 2014). Sub-clusters 1, 4 and 5 were suggested to contain a mixture of environmental and host-associated sequences, while sub-clusters 2 and 3 were exclusively of environmental origin.

Here, the high abundance of WPS-2 identified from Robinson Ridge and Mitchell Peninsula soils provides an excellent opportunity to investigate the ecological significance of this candidate phylum. The aims of this chapter were to resolve the phylogeny within the WPS-2 candidate phylum and to identify both biotic and abiotic environmental parameters that drive the high abundance, diversity and distribution of WPS-2 in Antarctic desert soils.

5.2 Materials and methods

5.2.1 WPS-2 sequence retrieval and alignment

Near-complete SSU RNA gene sequences (> 1300 bp) that were classified as WPS-2 or WD272 were retrieved from the Silva, Greengenes and RDP databases. Additionally, a further 20,000 sequences were retrieved from the NCBI non-redundant nucleotide database (nt) using BLAST with the query sequence AJ292684 (the first recognized WPS-2 clone). From the 20,000 hits, short sequences (<1300 bp) were removed and, since no WPS-2 genome has been reported, the genomic sequences were also removed.
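The length filter applied to the BLAST hits can be sketched in Python as follows. This is only an illustration under stated assumptions, not the procedure used in this work: the helpers `read_fasta` and `filter_by_length` are hypothetical names, the hits are assumed to be available in FASTA format, and the removal of genomic sequences is not shown.

```python
from typing import Dict, Iterable

MIN_LENGTH = 1300  # bp; sequences shorter than this were removed (from the text)


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into {identifier: sequence}."""
    records: Dict[str, str] = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The identifier is taken as the first token after '>'.
            current = line[1:].split()[0]
            records[current] = ""
        elif current is not None:
            records[current] += line
    return records


def filter_by_length(records: Dict[str, str],
                     min_length: int = MIN_LENGTH) -> Dict[str, str]:
    """Keep sequences with at least min_length bases (gap characters ignored)."""
    return {
        name: seq
        for name, seq in records.items()
        if len(seq.replace("-", "")) >= min_length
    }
```

A call such as `filter_by_length(read_fasta(open_file_handle))` would return only the near-complete sequences; the threshold is a parameter so that the stricter 1400 bp cut-off used for alignment could be applied in the same way.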

To ensure the phylogeny was not affected by the reference alignment database selected, all retrieved sequences that were >1400 bp (n=8,424) were aligned against the Greengenes (August 2013 release) or the Silva database (v123) separately, using the NAST algorithm implemented in the Mothur pipeline, and the common gaps within the alignments were then removed using the filter.seqs command in Mothur (DeSantis et al., 2006, Schloss et al., 2009). All sequences were then classified using the SILVA Incremental Aligner against Greengenes, RDP and Silva (Pruesse et al., 2012), and the pairwise similarity between sequences was calculated using the DNADIST function in the PHYLIP package (Plotree et al., 1989).
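Two of the steps above, removing alignment columns that contain only gaps and computing pairwise similarity, can be illustrated with a short Python sketch. It is a simplified stand-in and not Mothur's filter.seqs or PHYLIP's DNADIST: the function names are hypothetical, and similarity is computed here as plain percent identity over columns where both sequences carry a base, whereas DNADIST estimates distances under a nucleotide substitution model.

```python
from typing import List


def remove_common_gaps(alignment: List[str], gap_chars: str = "-.") -> List[str]:
    """Drop every alignment column in which all sequences carry a gap."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    keep = [
        col for col in range(length)
        if any(seq[col] not in gap_chars for seq in alignment)
    ]
    return ["".join(seq[col] for col in keep) for seq in alignment]


def percent_identity(seq_a: str, seq_b: str, gap_chars: str = "-.") -> float:
    """Percent identity over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared
```

Percent identity of this kind is the quantity reported against the reference sequence AJ292684 in the results, although the exact values depend on how gaps and ambiguous bases are treated.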

A refined WPS-2-only phylogenetic tree was then constructed, comprised of sequences present within an identified WPS-2 cluster (Figure 5.2). Shorter WPS-2 sequences retrieved from our Antarctic SSU RNA gene pyrosequencing (approximately 330 bp) (Siciliano et al., 2014) and shotgun metagenomics (196 bp for bin22, 336 bp for bin 23; chapter 3) were also included.

5.2.2 Phylogenetic tree construction

Maximum-likelihood phylogenetic trees were constructed using FastTree 2 with the generalized time-reversible (GTR) model (Price et al., 2010). The robustness of the phylogeny was tested with 1000 bootstrap replications and the trees were visualised in Dendroscope 3 (Huson et al., 2007).

5.2.3 Metagenomic data retrieval

The 454 amplicon pyrosequencing OTU abundance-by-sample matrix and environmental parameters published by Siciliano et al. (2013, updated 2014) were used to investigate the ecological drivers of WPS-2 abundance in Antarctic soil. The selected dataset was comprehensive, comprising bacterial and fungal diversity, as well as 67 measured environmental, geographical and soil physical parameters. The dataset contains 223 soil samples collected from Mitchell and Browning Peninsula, Casey station, Robinson Ridge and Herring Islands, East Antarctica, as well as Spitsbergen Slijeringa, Spitsbergen Vestpynten and Alexandra Fjord Highlands in the high Arctic (Figure 5.1).

Figure 5.1 Map of the Antarctic and High Arctic locations studied.

Maps showing the three high Arctic (A) and five East Antarctic (B) sampling sites. (Reproduced from Zhang, 2016).

5.2.4 Environmental parameter transformation and normalisation

To reduce dataset complexity in all the following analyses, highly inter-correlated environmental parameters (linear correlation r > 0.9) were dereplicated and environmental parameters were transformed to reduce skewness as appropriate (Appendix 5.1). The resulting environmental-by-sample matrix was then centered-reduced ((xi - x̄)/SD), thereby reducing the data to a consistent scale.
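The dereplication and centring-reducing steps can be sketched as below. This is a minimal illustration with several assumptions that are not stated in the text: the function names are hypothetical, the first parameter encountered in each correlated group is the one retained, the threshold is applied to r rather than to its absolute value, and the sample standard deviation (n - 1 denominator) is used. The skewness-reducing transformations of Appendix 5.1 are not shown.

```python
import math
from typing import Dict, List


def pearson_r(x: List[float], y: List[float]) -> float:
    """Pearson linear correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant vector carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def dereplicate(params: Dict[str, List[float]],
                threshold: float = 0.9) -> Dict[str, List[float]]:
    """Keep a parameter only if r with every already-kept parameter is <= threshold."""
    kept: Dict[str, List[float]] = {}
    for name, values in params.items():
        if all(pearson_r(values, other) <= threshold for other in kept.values()):
            kept[name] = values
    return kept


def centre_reduce(values: List[float]) -> List[float]:
    """Standardise values to zero mean and unit sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if sd == 0:
        raise ValueError("cannot standardise a constant parameter")
    return [(v - mean) / sd for v in values]
```

After this step every retained parameter has mean 0 and standard deviation 1, so parameters measured in different units contribute on a comparable scale.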

5.2.5 Transformation of WPS-2 OTU abundance matrix

The OTU abundance-by-sample matrix was aggregated to phylum level and the relative abundances (%) of the WPS-2 candidate phylum across all samples were calculated. Then, the WPS-2 relative abundance-by-sample matrix was square-root transformed.
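The aggregation and transformation can be written compactly, as in the following sketch. The data layout (a dictionary of per-sample read counts for each OTU and a dictionary mapping each OTU to a phylum) and the function names are assumptions made for illustration only.

```python
import math
from collections import defaultdict
from typing import Dict, List


def aggregate_by_phylum(otu_counts: Dict[str, List[int]],
                        otu_phylum: Dict[str, str]) -> Dict[str, List[int]]:
    """Sum OTU read counts per sample within each phylum."""
    n_samples = len(next(iter(otu_counts.values())))
    totals: Dict[str, List[int]] = defaultdict(lambda: [0] * n_samples)
    for otu, counts in otu_counts.items():
        phylum = otu_phylum[otu]
        for i, count in enumerate(counts):
            totals[phylum][i] += count
    return dict(totals)


def sqrt_relative_abundance(phylum_counts: Dict[str, List[int]],
                            target: str = "WPS-2") -> List[float]:
    """Square root of the target phylum's relative abundance (%) in each sample."""
    n_samples = len(phylum_counts[target])
    result = []
    for i in range(n_samples):
        total = sum(counts[i] for counts in phylum_counts.values())
        percent = 100.0 * phylum_counts[target][i] / total if total else 0.0
        result.append(math.sqrt(percent))
    return result
```

For example, a sample in which WPS-2 accounts for 25% of reads yields a transformed value of 5, which compresses the influence of the most WPS-2-rich samples relative to the raw percentages.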

5.2.6 Correlation analysis between WPS-2 OTUs and environmental factors

The Spearman correlations between the transformed WPS-2 relative abundance-by-sample matrix and the centered-reduced environmental parameters were calculated using the Hmisc package (Harrell Jr, 2008) to identify the significant (p < 0.05) parameters that correlated with variations in WPS-2 abundance.
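For readers unfamiliar with the statistic, Spearman's coefficient is the Pearson correlation of the rank-transformed data. The sketch below is a plain Python illustration rather than the Hmisc routine used here; the function names are hypothetical, ties receive average ranks, and the significance test (p < 0.05) is not shown.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # ranks are 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    n = len(rank_x)
    mean_x, mean_y = sum(rank_x) / n, sum(rank_y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rank_x, rank_y))
    var_x = sum((a - mean_x) ** 2 for a in rank_x)
    var_y = sum((b - mean_y) ** 2 for b in rank_y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)
```

Because the coefficient depends only on ranks, order-preserving transformations such as the square root or centring-reducing leave its value unchanged.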

5.2.7 Dominant WPS-2 OTUs and soil sample selection

Fifteen dominant WPS-2 OTUs (accounting for 92% of total WPS-2 reads) were selected based on their presence (>55 soil samples) and total abundance (≥1000 reads). To reduce potential variation caused by sequencing depth differences, the OTU matrix was standardised by random sub-sampling to the smallest coverage in the data set. The subsampled OTU matrix was square-root transformed to further reduce variation and then converted into a dissimilarity matrix using Bray-Curtis distance (Clarke et al., 2006). To reduce false correlations, soil samples were selected only if the total WPS-2 relative abundance within a soil sample was ≥5%.
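The sub-sampling, square-root transformation and Bray-Curtis steps can be sketched as follows. This is an illustrative outline only: the function names and the dictionary-based data layout are assumptions, the random seed is arbitrary, and the selection of dominant OTUs and of samples with ≥5% WPS-2 is not shown.

```python
import math
import random
from collections import Counter
from typing import Dict, List, Tuple


def rarefy(counts: Dict[str, int], depth: int, rng: random.Random) -> Dict[str, int]:
    """Randomly draw `depth` reads without replacement from one sample."""
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    drawn = Counter(rng.sample(pool, depth))
    return {otu: drawn.get(otu, 0) for otu in counts}


def bray_curtis(a: List[float], b: List[float]) -> float:
    """Bray-Curtis dissimilarity: sum(|a - b|) / sum(a + b)."""
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def dissimilarity_matrix(samples: Dict[str, Dict[str, int]],
                         seed: int = 0) -> Tuple[List[str], List[List[float]]]:
    """Rarefy to the smallest sample, square-root transform, then Bray-Curtis."""
    rng = random.Random(seed)
    depth = min(sum(c.values()) for c in samples.values())
    otus = sorted({otu for c in samples.values() for otu in c})
    transformed = {}
    for name, counts in samples.items():
        full = {otu: counts.get(otu, 0) for otu in otus}
        sub = rarefy(full, depth, rng)
        transformed[name] = [math.sqrt(sub[otu]) for otu in otus]
    names = list(samples)
    matrix = [[bray_curtis(transformed[p], transformed[q]) for q in names]
              for p in names]
    return names, matrix
```

The dissimilarity ranges from 0 for identical OTU profiles to 1 for samples that share no OTUs, and a matrix of this kind is the input to distance-based methods such as DistLM and db-RDA.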

5.2.8 Identifying key environmental parameters driving WPS-2 distribution

Significant (p < 0.05) environmental parameters that explained the distribution of dominant WPS-2 OTUs within the selected soil samples were identified using distance-based linear modelling (DistLM). Distance-based redundancy analysis (db-RDA) was then performed using the Primer + Permanova 6 package to assess the degree to which the distribution of dominant WPS-2 OTUs could be explained by the environmental covariates (Clark, 1993).

5.2.9 Co-occurrence network analysis

Network analysis was conducted to identify the biotic and abiotic variables that correlated with dominant WPS-2 OTUs. The Spearman correlations between the subsampled and transformed WPS-2 abundance-by-sample matrix and the normalized environmental parameters were calculated as described in 5.2.6. The Spearman's correlation matrix was converted into a network comprised of correlations ≥0.6 or ≤-0.6 using the igraph library (Csardi et al., 2006) and the networks were visualized using Cytoscape 3 (Shannon et al., 2003). The nodes in the constructed network represented either OTUs classified at 96% identity, which was selected to classify sequences at species level based on the SSU RNA gene fragment amplified (Kim et al., 2011, Siciliano et al., 2014), or environmental parameters. The edges corresponded to the correlations retained between nodes.
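Converting a correlation matrix into a network amounts to keeping one edge per pair of nodes whose correlation passes the cut-off. The sketch below shows this thresholding in plain Python; it does not use igraph or Cytoscape, and the function names and node labels are hypothetical.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, float]


def build_edges(names: List[str], corr: List[List[float]],
                cutoff: float = 0.6) -> List[Edge]:
    """Return one (node_a, node_b, rho) edge per pair with |rho| >= cutoff."""
    edges: List[Edge] = []
    for i in range(len(names)):
        # Upper triangle only: skips self-correlations and duplicate pairs.
        for j in range(i + 1, len(names)):
            rho = corr[i][j]
            if abs(rho) >= cutoff:
                edges.append((names[i], names[j], rho))
    return edges


def neighbours(edges: List[Edge], node: str) -> Dict[str, float]:
    """Nodes directly connected to `node`, with the retained correlation."""
    linked: Dict[str, float] = {}
    for a, b, rho in edges:
        if a == node:
            linked[b] = rho
        elif b == node:
            linked[a] = rho
    return linked
```

An edge list of this form, with the sign of each correlation kept as an edge attribute, can be written to a delimited text file and loaded into a network visualisation tool.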

5.3 Results

5.3.1 Phylogeny of WPS-2

From the 18,906 retrieved SSU RNA gene sequences, 221 sequences were classified as either WPS-2 (Greengenes) or WD272 (Silva), with only 161 sequences classified consistently by both databases (Table 5.1). In contrast, classification using RDP reference sequences failed to classify any sequence as WPS-2, and the sequences classified as WPS-2 by RDP were actually Planctomycetes. Among sequences classified as WPS-2 or WD272, 19% were classified as SHA-109 by Silva, while 5% were unclassified bacteria in Greengenes (Table 5.1).

At the phylum level, one cluster comprised of only WPS-2 (WD272) sequences was observed in both the Silva and Greengenes phylogenetic trees (Figure 5.2). There were five and six bacterial phyla that were in close proximity to the WPS-2 (WD272) cluster and were highly consistent between the two phylogenetic trees. However, the naming of these bacterial phyla was inconsistent between Silva and Greengenes classifications. These bacterial phyla are SHA-109/WS2 (classified under Silva/Greengenes database, same as below), Thermotogae/GAL15, Chloroflexi/AD3 and Chloroflexi/Chloroflexi. Additionally, for the phylogenetic tree using Silva as alignment reference, Armatimonadetes was also in close proximity to WPS-2 (Figure 5.2C and D). Interestingly, WPS-2/WD272 sequences were further separated into two sub-clusters: one contained sequences classified as WD272/WPS-2, while the second contained sequences classified as SHA-109/WPS-2 (Figure 5.2).

Table 5.1 Taxonomic classifications of sequences designated as WPS-2 or WD272.

While the majority of the 221 sequences were classified as WD272 and WPS-2 consistently, 19% of the total sequences were classified as SHA-109 by Silva. In contrast, RDP failed to classify any sequence into WPS-2.

Frequency (%)   Silva classification   RDP classification      Greengenes classification
56              WD272                  unclassified_Bacteria   WPS-2
16              WD272                  Bacteria                WPS-2
1               WD272                  Firmicutes              WPS-2
15              SHA-109                unclassified_Bacteria   WPS-2
3               SHA-109                Firmicutes              WPS-2
1               SHA-109                Unclassified            WPS-2
2               Proteobacteria         unclassified_Bacteria   WPS-2
2               WD272                  Firmicutes              Unclassified
1               WD272                  Bacteroidetes           Unclassified
1               WD272                  unclassified_Bacteria   Unclassified
1               WD272                  Unclassified            Unclassified

Figure 5.2 Phylogenetic trees of SSU RNA gene sequences associated with WPS-2.

Phylogenetic trees were constructed using retrieved sequences aligned using (A) Silva or (B) Greengenes as the alignment reference. The maximum likelihood phylogenetic trees were visualised as cladograms and nodes with bootstrap values ≥ 0.7 are marked with ●. The phylogenetic trees exhibited similar topologies, with sequences classified within the same phylum clustering together. In close proximity to the WPS-2 (WD272) cluster were WS2/SHA-109, Thermotogae/GAL15 and Chloroflexi. The WPS-2 (WD272) sequences were further divided into two sub-clusters classified as (C) WD272/WPS-2 or (D) SHA-109/WPS-2.


The average sequence identity between non-WPS-2 (WD272) sequences and the reference WPS-2 sequence (AJ292684) was 78% (Table 5.2), while the identities of sequences within the WD272 (WPS-2) cluster were higher (87%). Interestingly, within the WPS-2 (WD272) candidate phylum, the average identity of the SHA-109/WPS-2 cluster was much lower (79%) than that of the WD272/WPS-2 cluster (90%), and was not different from other non-WPS-2 phyla.

Table 5.2 Identities between reference WPS-2 sequences and observed clusters following phylogenetic analysis.

Clusters are named by the taxonomic classification of the majority of sequences within the cluster (Silva/Greengenes). Identities are to AJ292684.

Cluster (Silva/Greengenes)        No. sequences within cluster   Average (%)   Min (%)   Max (%)
WPS-2 (WD272)                     71                             87            76        100
WD272/WPS-2                       55                             90            82        100
SHA-109/WPS-2                     16                             79            76        80
Acidobacteria/Acidobacteria       556                            78            75        79
Chloroflexi/AD3                   25                             78            77        79
Aminicenantes/OP8                 10                             78            77        78
Thermotogae/GAL15                 9                              78            77        78
WS2/SHA-109                       13                             78            77        79
Actinobacteria/Actinobacteria     378                            77            75        79
Armatimonadetes/Armatimonadetes   138                            77            75        79
Chloroflexi/Chloroflexi           346                            77            74        80
Cyanobacteria/Cyanobacteria       767                            77            73        81
/Elusimicrobia                    8                              77            77        77
Firmicutes/Firmicutes             5317                           77            73        80
Proteobacteria/Proteobacteria     752                            77            73        81
/Synergistetes                    10                             77            76        78
Cyanobacteria/ZB3                 7                              77            75        78
TA06/AC1                          2                              75            75        75

A refined phylogenetic tree containing 252 sequences confirmed the observation that WPS-2 sequences clustered into two distinct groups (SHA-109/WPS-2 versus WD272/WPS-2; Figure 5.3), as differentiated within the Silva classification system. The SHA-109/WPS-2 cluster was comprised of four sub-clusters; sub-clusters 1 and 4 were comprised of both host-associated and environmental sequences, while sub-clusters 2 and 3 contained exclusively environmental sequences. As the similarities of the sequences within this cluster to the WPS-2 reference sequence were lower than those of the WD272/WPS-2 cluster (Table 5.2), this group was not considered as part of the WPS-2 candidate phylum here.

Within the WD272/WPS-2 cluster, two sub-clusters (I and II) were separated at the basal position (Figure 5.3), and the separation was caused by two insertions observed in the SSU rRNA gene of sub-cluster I (Figure 5.4). Within the phylogenetic tree, sub-cluster I contained 11 near-complete sequences obtained from soil, including grassland, aspen rhizosphere and high-elevation cold-fumarole soil, and seven short metagenomic reads (Figure 5.3; Subtree A), with identities to the WPS-2 reference sequences ranging between 82 and 85% (Table 5.2). Sub-cluster II contained 175 near-complete sequences and nine short metagenomic reads (Figure 5.3; Subtrees B and C) as well as the reference WPS-2 sequence, with similarities ranging from 85 to 95% (excluding reference and short reads). The sequence origins of sub-cluster II were more diverse: they were predominantly terrestrial (acid mine drainage, volcano, Antarctic desert and caves, uranium or polychlorinated contaminated soils and rhizosphere) and, most interestingly, included one sequence classified as a plant endosymbiont and two sequences associated with human skin (Figure 5.3).


Figure 5.3 Refined WPS-2 phylogeny using sequences classified as WD272 or WPS-2.

Chloroflexi sequences were included as the outgroup, and nodes with bootstrap values > 0.7 are marked with ●. Two clusters were readily distinguished, i.e., sequences classified as SHA-109/WPS-2 and sequences classified as WD272/WPS-2. Sequences from the SHA-109/WPS-2 cluster were further grouped into four sub-clusters, while WD272/WPS-2 sequences formed two sub-clusters (I and II), which were separated at the basal position. Sequences from sub-cluster I were predominantly of soil origin, while the sequences in sub-cluster II were more diverse. The bin22 WPS-2 draft genome sequence (red) was positioned in sub-cluster I, while bin23 was positioned in sub-cluster II. Short metagenome sequences (blue) were also identified in both sub-clusters (Subtrees A, B and C).


Figure 5.4 Two insertions identified in sub-cluster I WPS-2 SSU rRNA gene sequences.

The two insertions identified were eight to nine and eight bp long, with the labelled numbers corresponding to positions in a reference E. coli sequence.

5.3.2 Environmental correlations to WPS-2 abundance

Among the 223 Antarctic and Arctic samples analysed, the relative abundance of WPS-2 varied between 0 and 25% (Figure 5.5), with WPS-2 abundant (relative abundance >5%) almost exclusively in Mitchell Peninsula and Robinson Ridge soils. A strong negative correlation (r < -0.6) was observed between WPS-2 relative abundance and pH, as well as boron, magnesium, calcium and manganese (Figure 5.6). In contrast, moderate correlations (r of 0.3 to 0.6 or -0.3 to -0.6) to geophysical parameters, including sand percentage, mud percentage, mean soil particle size and dry matter fraction, and to chemical parameters were also observed (Figure 5.6).
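
The strength bands quoted above (strong: r beyond ±0.6; moderate: r between ±0.3 and ±0.6) can be illustrated with a minimal sketch. It assumes a Spearman rank correlation, which is the coefficient named in the Figure 5.7 caption; the coefficient used for Figure 5.6 is not stated in this section. The function names, the "weak" label and the toy pH and abundance values are hypothetical and are not thesis data.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def strength(r):
    """Bin a coefficient using the thresholds quoted in the text."""
    if abs(r) > 0.6:
        return "strong"
    if abs(r) >= 0.3:
        return "moderate"
    return "weak"  # label assumed here; the text does not name this band


# Hypothetical toy values (not thesis data): abundance falls as pH rises.
ph = [4.8, 5.1, 5.6, 6.2, 6.9, 7.4]
wps2_abundance = [25.0, 18.0, 9.0, 10.0, 1.5, 0.2]
rho = spearman(ph, wps2_abundance)
print(round(rho, 3), strength(rho))
```

Ranking before correlating makes the coefficient insensitive to the skewed distributions that are typical of relative abundance data, which is one reason a rank-based coefficient is a common choice for this kind of comparison.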


Figure 5.5 Relative abundance of WPS-2 in polar soils.

WPS-2 was most abundant in soils from Mitchell Peninsula and Robinson Ridge, reaching 25%. The relative abundance of this phylum at the six other sites investigated was <2%.

[Figure 5.6 image not reproduced; labels recovered from the plot: phosphorus, pH, sand percentage, Mg, SiO2, Ca, K2O, Mn, mean soil particle size, B, S, dry matter fraction, Zn, mud, total phosphorus and TiO2; correlation scale -0.8 to 0.6]

Figure 5.6 Significant geophysical and chemical parameters that correlated to WPS-2 relative abundances.

Strong correlations (r < -0.6 or > 0.6) were observed between WPS-2 relative abundance and pH, boron, magnesium, calcium and manganese.


Fifteen environmental and geophysical factors were significantly correlated to the distribution of dominant WPS-2 OTUs (Appendix 5.2), and these factors explained 51.2% of the variation across two axes of the PCoA plot (Figure 5.7). Three distinct WPS-2 OTU groups were observed, with their relative abundances differing across the 223 soil samples (Figure 5.7). Group 1 contained sub-cluster II OTUs, while group 2 contained sub-cluster I OTUs 589 and 2844 (Figure 5.7). In contrast, group 3 contained WPS-2 OTUs from both WPS-2 sub-clusters (Figure 5.7). The environmental vectors overlaid on the PCoA plot revealed that group 2 was negatively correlated to total carbon and mud percentage, while groups 1 and 3 were positively correlated to carbon and mud percentage (Figure 5.7).

The same three WPS-2 OTU groups were also observed in the network analysis (Figure 5.8). Groups 1 and 2 were both associated with mud percentage, but with opposite correlation directions, while extensive associations with other bacteria were observed for OTUs from group 3. These associations were most frequently with Acidobacteria, Chloroflexi and Actinobacteria, and less frequently with AD3 and Proteobacteria (Figure 5.8).

Figure 5.7 The distribution of dominant WPS-2 OTUs and the environmental parameters that best explained their distributions.

Fifteen geophysical and chemical parameters explained 51.2% of the variation in WPS-2 abundance across two axes. Environmental parameters and WPS-2 OTUs with Spearman's correlation >0.2 and >0.6, respectively, were overlaid. Three groups of WPS-2 OTUs were differentiated: group 1 comprised sequences from WPS-2 sub-cluster II (red), group 2 comprised sequences from WPS-2 sub-cluster I (purple) and group 3 comprised a mixture of sub-cluster I and II sequences (blue).


Figure 5.8 Network analysis of dominant WPS-2 OTUs to environmental parameters and bacterial OTUs.

Positive correlations are indicated by black solid lines, while negative correlations are denoted by red dashed lines. The three groups of WPS-2 OTUs were consistently observed in both Figure 5.7 and the network analysis.


5.4 Discussion

The sequences classified as WPS-2 according to Greengenes in fact belonged to two different bacterial phyla, i.e., SHA-109 and WPS-2, as evidenced by: the separation of the two phyla at the basal position of the phylogenetic trees (Figures 5.2 and 5.3); the different sequence origins of SHA-109 and WD272; and the low sequence similarity (<80%) of the SHA-109 cluster to the reference WPS-2 sequence, compared with >80% for the WPS-2 sequence cluster (Table 5.2).
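
The similarity criterion in the third line of evidence can be sketched as follows. This is a minimal illustration, assuming that identity is computed over the columns of a pairwise alignment in which both sequences have a base; how similarity was actually calculated is not described in this section. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns (an assumption; other conventions exist)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def within_wps2_range(identity, threshold=80.0):
    """Apply the >80% cut-off quoted in the text for the WPS-2 cluster."""
    return identity > threshold


# Hypothetical toy alignment, not real SSU rRNA gene data.
reference = "ACGTACGTAC"
query = "ACGTTCG-AC"
identity = percent_identity(reference, query)
print(f"{identity:.1f}% identity; within WPS-2 range: {within_wps2_range(identity)}")
```

Whether gapped columns are skipped or counted as mismatches changes the reported identity, so a threshold such as 80% is only comparable between studies that use the same convention.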

The phylogenetic structure within the WPS-2 and SHA-109 candidate phyla was consistent with the phylogeny proposed by Camanocha and Dewhirst (Camanocha et al., 2014). Their proposed sub-cluster 5 was in fact the WD272/WPS-2 identified here, while sub-clusters 1-4 were actually candidate phylum SHA-109. In addition to the current phylogeny, we have greatly expanded the WD272/WPS-2 phylum and propose that it comprises two sub-clusters (Figure 5.3). The two insertions identified in sub-cluster I (Figure 5.4) could be used to design primers targeting each of the sub-clusters separately. Most interestingly, the genomes recovered (Chapter 4, bin22 and bin23) were positioned across the two separate WPS-2 sub-clusters (Figure 5.3), suggesting that the two sub-clusters potentially have different trophic strategies, i.e., sub-cluster I as mixotrophs and sub-cluster II as obligate heterotrophs.

WPS-2 was predominantly identified in the acidic soils of Mitchell Peninsula and Robinson Ridge (Figures 5.5 and 5.6), and the absence of WPS-2 from other Antarctic sites cannot be explained by primer bias, as the same 16S rRNA gene primers were used across all sites (Siciliano et al., 2014). Similarly, a high relative abundance (23%) of WPS-2 was also reported in an acidic relic spring mound at Paint Pots Springs, Canada (Grasby et al., 2013). Three groups of OTUs were identified that shared similar distribution patterns and correlations to biotic and abiotic factors (Figures 5.7 and 5.8). Interestingly, the grouping was differentiated based on the positions of OTUs in the WPS-2 phylogenetic tree. As the two WPS-2 sub-clusters are proposed to comprise chemolithoautotrophs and heterotrophs, and the sub-cluster I OTUs were negatively correlated to carbon content and mud percentage (Figure 5.7), this result suggests that the two trophic strategies of WPS-2 have distinct preferences in soil geophysical and chemical parameters.


Chapter 6 Discussion and conclusions

The polar desert soils of Mitchell Peninsula and Robinson Ridge host not only a unique microbial community structure but also an unusual functional gene capacity. Soils from these sites are dominated by microbial dark matter, exhibiting high relative abundances of WPS-2, AD3 and cultured Actinobacteria and Chloroflexi (Figures 2.3 and 3.2). This novel community structure is distinctly different from any other soil ecosystem described to date, with high relative abundances of both AD3 and WPS-2 in the same ecosystem, reaching an unprecedented combined relative abundance of 28%. This poorly described ecosystem lacks genes for photosynthesis and is instead dominated by a dark carbon fixation process based on the catabolism of trace atmospheric gases (H2, CO).

Atmospheric hydrogen gas as an energy source for mixotrophic growth

In Antarctica, phototrophs including Cyanobacteria and algae are the primary producers within most terrestrial ecosystems (Vincent et al., 1993, Wood et al., 2008, Lacap et al., 2001, Makhalanyane et al., 2015). In contrast, permanently ice-covered aquatic ecosystems are dominated by chemolithoautotrophs that utilise energy derived from the oxidation of methane, ammonium, reduced iron, sulfur or sulfide (Lauro et al., 2011, Kong et al., 2012, Sattley et al., 2006, Bowman et al., 1997).

At Mitchell Peninsula and Robinson Ridge, Cyanobacteria were only present at extremely low relative abundances (0.3%), and the genes for typical dark carbon fixation pathways, such as methane, sulfide or ammonia oxidation, were also identified at very low abundances. Nevertheless, a high abundance of type IE RuBisCO was present in the metagenome. Type IE RuBisCO has been identified in Antarctic terrestrial ecosystems such as soils and caves (Tebo et al., 2015, Chan et al., 2013). Though it has been proposed to carry out chemolithoautotrophic carbon fixation (Tebo et al., 2015), the energy source has never been identified. In this study, abundant type 1h/5 [NiFe]-hydrogenases were identified within the Robinson Ridge metagenome, and were frequently identified in the potential carbon fixers. Therefore, I propose that atmospheric hydrogen gas is the energy source for the dark carbon fixation identified at Robinson Ridge. However, hydrogen gas is present in the atmosphere at only 0.55-0.6 ppm (approximately 24-26 nM) (Glueckauf et al., 1957), which is much lower than the Km reported for hydrogenases (>800 nM) (Conrad, 1996), and also lower than the Km of high-affinity type 1h [NiFe]-hydrogenases (Km: 113 nM) (Greening et al., 2014). Therefore, for this hypothesis to hold, either the hydrogenase activity is very low, or an unknown H2-concentrating mechanism remains to be identified; alternatively, there might be soil niches that provide elevated H2.
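
To illustrate the kinetic constraint described above, the following minimal sketch evaluates the fraction of the maximal rate expected at atmospheric H2 concentrations. It assumes simple Michaelis-Menten kinetics, v/Vmax = [S]/(Km + [S]), which is an illustrative assumption rather than a model stated in this thesis. The concentrations and Km values are those quoted in the paragraph; the function and variable names are hypothetical.

```python
def fraction_of_vmax(substrate_nm, km_nm):
    """Michaelis-Menten fractional rate v/Vmax = [S] / (Km + [S])."""
    if substrate_nm < 0 or km_nm <= 0:
        raise ValueError("substrate must be >= 0 and Km must be > 0")
    return substrate_nm / (km_nm + substrate_nm)


# Values quoted in the text, in nM.
atmospheric_h2 = (24.0, 26.0)  # approximate atmospheric H2 range
km_values = {
    "hydrogenases in general (lower bound of reported Km)": 800.0,
    "high-affinity type 1h [NiFe]-hydrogenase": 113.0,
}

for label, km in km_values.items():
    low, high = (fraction_of_vmax(s, km) for s in atmospheric_h2)
    print(f"{label}: {low:.1%} to {high:.1%} of Vmax")
```

Under this assumption, a high-affinity enzyme would operate at roughly 18% of its maximal rate at atmospheric H2, and an enzyme with a Km of 800 nM at about 3%, which is consistent with the expectation that activity would be low unless H2 is locally elevated.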


Type 1h/5 [NiFe]-hydrogenases have been reported to be present in Actinobacteria, Proteobacteria, Chloroflexi and Acidobacteria (Constant et al., 2011, Greening et al., 2015a). It is now widely accepted that type 1h/5 [NiFe]-hydrogenases play a vital role in maintaining bacterial dormancy during periods of nutrient deprivation by liberating electrons into the respiratory electron transport chain via atmospheric H2 oxidation (Berney et al., 2014, Greening et al., 2014a, Greening et al., 2015b, Constant et al., 2010). In addition, type 1h/5 [NiFe]-hydrogenases have also been shown to oxidise atmospheric H2 together with organic carbon to enhance bacterial growth (Berney et al., 2014, Greening et al., 2014b). Most interestingly, they have also been reported to support chemolithoautotrophic growth in a few Actinobacterial lineages (Grostern et al., 2013). While the type 1h/5 [NiFe]-hydrogenase-bearing Mycobacterium smegmatis was reported to be unable to grow with H2 as the sole energy source (Berney et al., 2014), Pseudonocardia autotrophica and Pseudonocardia dioxanivorans have been demonstrated to be hydrogenotrophic and can grow chemolithoautotrophically using atmospheric H2 and CO2 (Kanai et al., 1960, Grostern et al., 2013). In Robinson Ridge, Pseudonocardia was the most dominant Actinobacteria identified (Table 3.3); therefore, it is highly possible that the oxidation of tropospheric-level H2 is supporting a chemolithoautotrophic community.

It is well known that chemolithoautotrophic carbon fixation is energetically expensive; thus, carbon fixation is often activated when bio-available carbon levels are low (Joshi et al., 2009). Using H2 as a sole energy source for carbon fixation is predicted to occur only in carbon-limited conditions with elevated H2 concentration. This is evidenced by the fact that carbon fixation using H2 as the energy source was inhibited by the presence of pyruvate within a cell, and chemolithoautotrophic growth rates are slower than those in medium containing organic carbon (Grostern et al., 2013). In addition, within the bacteria identified in Robinson Ridge that contain both type IE RuBisCO and type 1h/5 [NiFe]-hydrogenase genes, alternative carbon acquisition pathways were always identified. Therefore, I hypothesise that the type 1h/5 [NiFe]-hydrogenases play multiple roles in the Antarctic ecosystem: during the Antarctic summer, when snowmelt replenishes nitrogen and carbon in soils, type 1h/5 [NiFe]-hydrogenases couple the oxidation of atmospheric H2 with the oxidation of other organic compounds to allow mixotrophic growth, while in the Antarctic winter, or when organic carbon is depleted, type 1h/5 [NiFe]-hydrogenases provide energy to sustain bacterial dormancy or switch the mixotrophic bacteria into chemolithoautotrophic mode. In contrast to the bio-available organic compounds in Antarctic soils, H2 in the atmosphere, although present only at a trace concentration of 0.6 ppm (Glueckauf et al., 1957), exists in unlimited amounts and at a constant concentration (Greening et al., 2014a). Consequently, H2 is a dependable fuel source for bacteria in this highly unstable Antarctic environment. Therefore, the capability of using atmospheric H2 as an energy source for mixotrophic growth via the highly flexible type 1h/5 [NiFe]-hydrogenases provides a great competitive advantage for the bacteria thriving in the extremely carbon-limited and unstable Antarctic dry desert soils.

Form IE RuBisCO is abundant in the Windmill Islands, and Actinobacteria are a potential CO2 sink in Antarctica

The RuBisCO gene is indicative of carbon fixation (John et al., 2007), and was identified in a diverse range of bacteria in Robinson Ridge, including, for the first time, the candidate division bacteria WPS-2 and AD3. RuBisCO is currently classified into four forms (I, II, III and IV) based on sequence homology and organism origin. Of these, only forms I and II are known to be involved in bacterial carbon fixation (Tabita et al., 2008). Form I RuBisCO is further clustered into five forms (A, B, C, D and E). Forms A and B are grouped as the “green type” and include RuBisCO from phototrophs such as Cyanobacteria, Proteobacteria, algae and green plants. In contrast, forms C and D are grouped as the “red type” and include RuBisCO present in non-photosynthetic Proteobacteria and red algae (Badger et al., 2008). Most interestingly, a novel form IE has recently been found in Actinobacteria and Verrucomicrobia that are capable of chemolithoautotrophic carbon fixation (Park et al., 2009, Grostern et al., 2013). We here provide further evidence to expand the diversity of form IE RuBisCO, including two novel bacterial lineages that carry this form of RuBisCO.

The majority (60%) of the recovered type IE RuBisCO genes were from Actinobacteria, which suggests that Actinobacteria are major carbon fixers within Robinson Ridge. Moreover, the highly environmentally resilient Actinobacteria are widely distributed in Antarctic soils and have been identified at high abundances in soils from various regions such as the McMurdo Dry Valleys and King George Island (Aislabie et al., 2006, Smith et al., 2006, Teixeira et al., 2010). Therefore, type IE RuBisCO is potentially widespread in Antarctic soils and is likely to play an important role in carbon acquisition for bacteria surviving in the carbon-limited Antarctic environment. In addition, Actinobacterial abundance has been reported to increase with increased CO2 (Lesaulnier et al., 2008). Therefore, Actinobacterial communities are potentially important CO2 sinks in these environments.

Nitrogen balancing in the Windmill Islands polar deserts

Nitrogen fixation capacity has been widely reported in the Antarctic (Cowan et al., 2011, Chan et al., 2013), with the process usually carried out by Cyanobacteria and Proteobacteria (Raymond et al., 2004). However, the lower abundance of both Cyanobacteria (0.3%) and Proteobacteria (9.8%) in Mitchell Peninsula and Robinson Ridge compared with other Antarctic soils, combined with a lack of detectable nitrogen fixation genes (nifHDKEN operon), suggests that nitrogen availability is likely to be a major limiting factor for bacterial survival in this extreme environment.

The bio-available nitrogen pool must be replenished via natural replenishment or fixation processes such as snowmelt, volcanic activity and glacier melt, or through mineralisation of nitrogen-containing macromolecules such as chitin (Burkins et al., 2000, Oppenheimer et al., 2005, Fowler et al., 2013). However, while no nitrogen fixation-related genes were identified from shotgun metagenomics, a low abundance of Cyanobacteria with potential nitrogen fixation capacity was identified within the soil. Therefore, these low-abundance Cyanobacteria could contribute to nitrogen influx via nitrogen fixation, albeit probably at very low levels. Despite the potential lack of nitrogen influx in these soils, a diverse range of denitrifiers was present; thus, the production of gaseous nitric oxide, nitrous oxide or nitrogen gas during denitrification could lead to nitrogen loss from the biological system in this nitrogen-limited environment (Luo et al., 2000, Ward et al., 2009). Combined with limited nitrogen influx, the capacity for nitrogen loss in this nutrient-poor soil may ultimately lead to a net deficit of nitrogen, making this ecosystem unsustainable in the long term. Therefore, the rate of denitrification and any evidence for alternative mechanisms to replenish nitrogen in the ecosystem require further investigation.

Requirement of SSU gene standardisation among databases

While retrieving SSU gene sequences for WPS-2, inconsistent naming of other bacterial candidate phyla was observed: candidate phylum WS2 in Greengenes was SHA-109 in Silva, TA06 in Silva was named AC1 in Greengenes, and candidate phyla ZB3 and AD3 were not classified as distinct phyla in Silva. This leads to confusion about which bacterial community is reported to be present across studies when different SSU rRNA gene databases are used to classify the sequences retrieved. For example, Tebo et al. (2015) reported that in ice caves of Mt Erebus, a novel form I RuBisCO cluster was present that was associated with Cyanobacteria group WD272 (classified according to the Silva database at the time) (Quast et al., 2013, Tebo et al., 2015), and WD272 was hypothesized to be a chemolithoautotroph (Tebo et al., 2015). However, this is clearly misleading, as Cyanobacteria are only known for photosynthetic carbon fixation (Whitton 2012). In fact, as revealed in chapter 4, WD272 is a distinct bacterial candidate phylum with potential chemolithoautotrophic carbon fixation capacity. Youssef et al. compared all the phyla recognised by Greengenes and Silva, but ignored the fact that many of the different phyla identified from the two databases were in fact the same (Youssef et al., 2015). Van Goethem et al. investigated the microbial community in Antarctic lithobionts and soils, where up to 11% of sequences were unclassified at phylum level (Van Goethem et al., 2016). However, the classification was based on the RDP database, which contains a high proportion of cultured sequences (only recognizing 49 bacterial phyla) and has the lowest coverage among the three databases (Silva, Greengenes, RDP) (Newton et al., 2012). This makes RDP less accurate for the identification of uncultured candidate phyla, as demonstrated in chapter 4. Therefore, I propose that there is an urgent requirement for standardisation of classification systems across databases to ensure appropriate databases are used to classify sequences of interest.
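The practical effect of inconsistent naming can be illustrated with a minimal Python sketch. It assumes a hand-curated synonym table that contains only the two equivalences stated above (WS2 in Greengenes = SHA-109 in Silva; TA06 in Silva = AC1 in Greengenes). The function names, the table and the shared labels are hypothetical and are not part of any database release or of the analysis pipeline used in this thesis.

```python
# Hypothetical synonym table: (database, label) -> shared label.
# Only the equivalences stated in the text are included; the shared
# labels themselves are arbitrary and chosen for illustration.
SYNONYMS = {
    ("greengenes", "WS2"): "WS2/SHA-109",
    ("silva", "SHA-109"): "WS2/SHA-109",
    ("silva", "TA06"): "TA06/AC1",
    ("greengenes", "AC1"): "TA06/AC1",
}


def harmonise(database: str, phylum: str) -> str:
    """Return the shared label for a phylum name, or the name unchanged if unmapped."""
    return SYNONYMS.get((database.lower(), phylum), phylum)


def shared_phyla(greengenes_phyla, silva_phyla):
    """Phyla reported by both databases once names have been harmonised."""
    gg = {harmonise("greengenes", p) for p in greengenes_phyla}
    sv = {harmonise("silva", p) for p in silva_phyla}
    return gg & sv


if __name__ == "__main__":
    # Without harmonisation these two lists share no label at all.
    print(shared_phyla(["WS2", "AC1", "AD3"], ["SHA-109", "TA06"]))
```

Comparing the raw labels directly would report no overlap between the two classifications, whereas the harmonised comparison recovers both shared phyla. A curated table of this kind would need to be maintained alongside each database release.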

Proposed functional capacities of candidate phyla WPS-2 and AD3

To our knowledge this is the first report of draft genomes of candidate phyla WPS-2 and AD3. I provide important insights into not only the ecological roles these candidate phyla play in this extreme environment, but also their physiological properties such as cell envelope structure and metabolic potential, with two distinct trophic strategies discovered within both phyla. The genetic capacity for autotrophic carbon fixation was apparent in assembled genomes of both candidate phyla WPS-2 and AD3, while draft genomes without carbon fixation capacity were also identified (Figure 4.2). The presence of alternative carbon acquisition pathways in candidate phyla WPS-2 and AD3 indicated a facultative autotrophic lifestyle.

The draft genomes indicated WPS-2 to be diderm with a broad metabolic capacity, ranging from heterotrophs to facultative autotrophs that utilise atmospheric gases (CO2 and H2) for carbon fixation. Given the limited carbon acquisition capacity for both genomes and the incomplete amino acid and GTP biosynthesis capacities, I propose WPS-2 would be slow growing and would need to acquire these essential compounds externally to complete its metabolism. Due to the obligately aerobic nature of WPS-2, traditional cultivation in liquid broth may not be appropriate, as static liquid culture could not provide sufficient oxygen, while rapid agitation is likely to disrupt the symbiotic relationship between WPS-2 and its symbiont. In contrast, traditional agar-based cultivation is predicted to lead to WPS-2 being out-competed by fast-growing bacteria within the community. Instead, mimicking the natural environment through the soil substrate membrane system, which allows bacterial microcolonies to grow on a flat surface, will ensure maximum air exposure for cultivation (Ferrari et al., 2005). The system could be incubated under CO2 and H2, with the soil slurry supplemented with amino acids and co-factors (such as Vitamin B12 and folate), thus providing WPS-2 with those compounds it cannot biosynthesise de novo. The system could then be screened for the growth of WPS-2 using the WPS-2 specific FISH probe designed in this study.

Future research

With a high abundance of type IE RuBisCO and type 1h/5 high affinity [NiFe]-hydrogenases identified in the draft genomes recovered from Robinson Ridge, a novel type of chemolithoautotrophic carbon fixation strategy is proposed here. Interestingly, similar Actinobacteria-dominated community structures have been reported frequently from Antarctic and non-polar regions (Ortiz et al., 2014). This suggests gas-scavenging survival strategies could be ubiquitous in microbial communities across the Antarctic continent. In addition, they could also be present in other non-polar, carbon-limited extreme environments such as hot deserts and volcanic soils, and contribute toward global CO2 and H2 cycling. Therefore, screening for the abundance and expression level of both high affinity type 1h/5 [NiFe]-hydrogenases and type IE RuBisCO across polar and non-polar regions is required. The correlation between gene expression level and soil carbon availability would also provide valuable information on the environmental adaptation strategies and energy metabolism of soil-dwelling bacteria in extreme environments globally.

The proposition of a novel primary production strategy reliant on the exclusive use of atmospheric gases (CO, CO2) suggests the existence of a third carbon fixation mechanism, distinct from phototrophy and geothermal chemotrophy. This hypothesis still requires further experimental support, such as measuring the consumption of H2 by soil bacteria using gas chromatography (Greening et al., 2014) and the incorporation of CO2 using stable isotope labelling (Friedrich 2006). In addition, as Antarctica is one of the regions most heavily impacted by global warming on Earth (Royles et al., 2013), field surveys on Robinson Ridge are required to investigate the response of the unique community revealed here to global warming, and the effect of increased temperature on hydrogenase and type IE RuBisCO abundance and activity in Antarctica.

Long term nitrogen loss was predicted to occur in Robinson Ridge based on the presence of a complete denitrification pathway but the absence of nitrogen fixation. A high abundance of genes associated with denitrification pathways has also been identified from soils in the McMurdo Dry Valleys (Chan et al., 2013). In that case, it was proposed that denitrification is a tightly controlled process that only occurs during the Antarctic summer, in waterlogged soils when snowmelt replenishes nitrogen levels. Therefore, a field investigation would be required to resolve the nitrogen balance of Mitchell Peninsula and Robinson Ridge soils and identify potential sources of nitrogen influx.

Conclusions

Exploration of the metabolic potential and starvation survival strategies of bacteria within the extremely oligotrophic, cold and dry desert soils of East Antarctica provides strong evidence for the dominance of a facultative chemolithoautotrophic lifestyle. Actinobacteria dominate the system and appear to have adapted to the extreme environmental conditions by obtaining the capacity to oxidise hydrogen or carbon monoxide gas as energy sources to grow under carbon-poor conditions. The recovery of the first draft genomes from candidate divisions WPS-2 and AD3 provided important insights into these unknown phyla, both of which exhibited diverse metabolic lifestyles including the capacity for chemolithotrophic carbon fixation using trace gases. Both phyla acquired a similar strategy of facultative carbon fixation to not only survive dormancy, but thrive under the nutrient-poor conditions experienced in Robinson Ridge. The widespread presence of a novel type IE RuBisCO and the presence of a new primary production process based on dark carbon fixation, powered by the oxidation of hydrogen and carbon monoxide gas, need further confirmation. I believe atmospheric carbon fixation provides a competitive advantage for these novel bacterial phyla under the starvation and extended dormancy conditions experienced in this extreme polar desert.

References

Aislabie, J., Fraser, R., Duncan, S. and Farrell, R.L. 2001. Effects of oil spills on microbial heterotrophs in Antarctic soils. Polar Biology 24:308-313.

Aislabie, J.M., Balks, M.R., Foght, J.M. and Waterhouse, E.J. 2004. Hydrocarbon spills on Antarctic soils: effects and management. Environmental Science & Technology 38:1265-1274.

Aislabie, J., Jordan, S. and Barker, G. 2008. Relation between soil classification and bacterial diversity in soils of the Ross Sea region, Antarctica. Geoderma 144:9-20.

Aislabie, J., Jordan, S., Ayton, J., Klassen, J., Barker, G. and Turner, S. 2009. Bacterial diversity associated with ornithogenic soil of the Ross Sea region, Antarctica. Canadian Journal of Microbiology 55:21-36.

Albertsen, M., Hugenholtz, P., Skarshewski, A., Nielsen, K.L., Tyson, G.W. and Nielsen, P.H. 2013. Genome sequences of rare, uncultured bacteria obtained by differential coverage binning of multiple metagenomes. Nature Biotechnology 31:533-538.

Alloisio, N., Maréchal, J., Normand, P. and Berry, A. 2005. Characterization of a gene locus containing squalene-hopene cyclase (shc) in Frankia alni ACN14a, and an shc homolog in Acidothermus cellulolyticus. Symbiosis (Rehovot), Balaban Publishers 39:83-90.

Altschul, S.F., Gish, W., Miller, W., Myers, E.W. and Lipman, D.J. 1990. Basic local alignment search tool. Journal of Molecular Biology 215:403-410.

Amann, R.I., Binder, B.J., Olson, R.J., Chisholm, S.W., Devereux, R. and Stahl, D.A. 1990. Combination of 16S rRNA-targeted oligonucleotide probes with flow cytometry for analyzing mixed microbial populations. Applied and Environmental Microbiology 56:1919-1925.

Amann, R., Fuchs, B.M. and Behrens, S. 2001. The identification of microorganisms by fluorescence in situ hybridisation. Current Opinion in Biotechnology 12:231-236.

An, M., Mou, S., Zhang, X., Ye, N., Zheng, Z., Cao, S., Xu, D., Fan, X., Wang, Y. and Miao, J. 2013. Temperature regulates fatty acid desaturases at a transcriptional level and modulates the fatty acid profile in the Antarctic microalga Chlamydomonas sp. ICE-L. Bioresource Technology 134:151-157.

Anantharaman, K., Brown, C.T., Burstein, D., Castelle, C.J., Probst, A.J., Thomas, B.C., Williams, K.H. and Banfield, J.F. 2016. Analysis of five complete genome sequences for members of the class Peribacteria in the recently recognized Peregrinibacteria bacterial phylum. PeerJ 4:e1607.

Arenz, B.E. and Blanchette, R.A. 2011. Distribution and abundance of soil fungi in Antarctica at sites on the Peninsula, Ross Sea Region and McMurdo Dry Valleys. Soil Biology and Biochemistry 43:308-315.

Azmi, O.R. and Seppelt, R.D. 1997. Fungi of the Windmill Islands, continental Antarctica. Effect of temperature, pH and culture media on the growth of selected microfungi. Polar Biology 18:128-134.

Azmi, O.R. and Seppelt, R.D. 1998. The broad-scale distribution of microfungi in the Windmill Islands region, continental Antarctica. Polar Biology 19:92-100.

Badger, M.R. and Bek, E.J. 2008. Multiple Rubisco forms in proteobacteria: their functional significance in relation to CO2 acquisition by the CBB cycle. Journal of Experimental Botany 59:1525-1541.

Bakermans, C., Skidmore, M.L., Douglas, S. and McKay, C.P. 2014. Molecular characterization of bacteria from permafrost of the Taylor Valley, Antarctica. FEMS Microbiology Ecology 89:331-346.

Barberán, A., Bates, S.T., Casamayor, E.O. and Fierer, N. 2012. Using network analysis to explore co-occurrence patterns in soil microbial communities. The ISME Journal 6:343-351.

Bargagli, R., Skotnicki, M., Marri, L., Pepi, M., Mackenzie, A. and Agnorelli, C. 2004. New record of moss and thermophilic bacteria species and physico-chemical properties of geothermal soils on the northwest slope of Mt. Melbourne (Antarctica). Polar Biology 27:423-431.

Barria, C., Malecki, M. and Arraiano, C. 2013. Bacterial adaptation to cold. Microbiology 159:2437-2443.

Bates, S.T., Cropsey, G.W., Caporaso, J.G., Knight, R. and Fierer, N. 2011. Bacterial communities associated with the lichen symbiosis. Applied and Environmental Microbiology 77:1309-1314.

Benaud, N. 2014. Polar soils Actinobacteria: a potential source of novel antibiotic secondary metabolites. Bachelor of Science (Honours), University of New South Wales.

Bengtsson‐Palme, J., Ryberg, M., Hartmann, M., Branco, S., Wang, Z., Godhe, A., Wit, P., Sánchez‐García, M., Ebersberger, I., Sousa, F. and Amend, A. 2013. Improved software detection and extraction of ITS1 and ITS2 from ribosomal ITS sequences of fungi and other eukaryotes for analysis of environmental sequencing data. Methods in Ecology and Evolution 4:914-919.

Berney, M., Greening, C., Hards, K., Collins, D. and Cook, G.M. 2014. Three different [NiFe] hydrogenases confer metabolic flexibility in the obligate aerobe Mycobacterium smegmatis. Environmental Microbiology 16:318-330.

Bertaux, J., Gloger, U., Schmid, M., Hartmann, A. and Scheu, S. 2007. Routine fluorescence in situ hybridization in soil. Journal of Microbiological Methods 69:451-460.

Beyer, L., Bölter, M. and Seppelt, R.D. 2000. Nutrient and thermal regime, microbial biomass, and vegetation of Antarctic soils in the Windmill Islands region of east Antarctica (Wilkes Land). Arctic, Antarctic, and Alpine Research 30-39.

Blin, K., Medema, M.H., Kazempour, D., Fischbach, M.A., Breitling, R., Takano, E. and Weber, T. 2013. antiSMASH 2.0 - a versatile platform for genome mining of secondary metabolite producers. Nucleic Acids Research gkt449.

Bolger, A.M., Lohse, M. and Usadel, B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114-2120.

Bowman, J.P., McCammon, S.A. and Skerratt, J.H. 1997. Methylosphaera hansonii gen. nov., sp. nov., a psychrophilic, group I methanotroph from Antarctic marine-salinity, meromictic lakes. Microbiology 143:1451-1459.

Bridge, P.D. and Newsham, K.K. 2009. Soil fungal community composition at Mars Oasis, a southern maritime Antarctic site, assessed by PCR amplification and cloning. Fungal Ecology 2:66-74.

Broady, P.A. 1981. The ecology of sublithic terrestrial algae at the Vestfold Hills, Antarctica. British Phycological Journal 16:231-240.

Brown, C.T., Hug, L.A., Thomas, B.C., Sharon, I., Castelle, C.J., Singh, A., Wilkins, M.J., Wrighton, K.C., Williams, K.H. and Banfield, J.F. 2015. Unusual biology across a group comprising more than 15% of domain Bacteria. Nature 523:208-211.

Burkins, M.B., Virginia, R.A., Chamberlain, C.P. and Wall, D.H. 2000. Origin and distribution of soil organic matter in Taylor Valley, Antarctica. Ecology 81:2377-2391.

Busse, H.J., Denner, E.B., Buczolits, S., Salkinoja-Salonen, M., Bennasar, A. and Kämpfer, P. 2003. Sphingomonas aurantiaca sp. nov., Sphingomonas aerolata sp. nov. and Sphingomonas faeni sp. nov., air- and dustborne and Antarctic, orange-pigmented, psychrotolerant bacteria, and emended description of the genus Sphingomonas. International Journal of Systematic and Evolutionary Microbiology 53:1253-1260.

Caldwell, P.E., MacLean, M.R. and Norris, P.R. 2007. Ribulose bisphosphate carboxylase activity and a Calvin cycle gene cluster in Sulfobacillus species. Microbiology 153:2231-2240.

Camanocha, A. and Dewhirst, F.E. 2014. Host-associated bacterial taxa from Chlorobi, Chloroflexi, GN02, Synergistetes, SR1, TM7, and WPS-2 Phyla/candidate divisions. Journal of Oral Microbiology 6.

2 Cantarel, B.L., Coutinho, P.M., Rancurel, C., Bernard, T., Lombard, V. and

2009. The Carbohydrate-Active EnZymes database (CAZy): an expert 3 Henrissat, B. 2009. The Carbohydrate-Active EnZymes database (CAZy): an expert

:D233-D238. 4 resource for glycogenomics. Nucleic Acids Research 37:D233-D238.

5 Caporaso, J.G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F.D.,

6 Costello, E.K., Fierer, N., Pena, A.G., Goodrich, J.K., Gordon, J.I. and Huttley,

2010. QIIME allows analysis of high-throughput community sequencing data. 7 G.A. 2010. QIIME allows analysis of high-throughput community sequencing data.

:335-336. 8 Nature Methods 7:335-336.

Evaluation of 9 de Cárcer, D., Denman, S., McSweeney, C., and Morrison, M. 2011. Evaluation of

subsampling-based normalization strategies for tagged high-throughput sequencing data 10 subsampling-based normalization strategies for tagged high-throughput sequencing data

: 8795-8798. 11 sets from gut . Applied and Environmental Microbiology 77: 8795-8798.

12 Carini, P., Marsden, Patrick., Leff, J., Morgan, E., Strickland, M., Fierer, N. 2016.

R elic DNA is abundant in soil and obscures estimates of soil microbial diversity. 13 Relic DNA is abundant in soil and obscures estimates of soil microbial diversity.

bioRxiv 043372; doi: http://dx.doi.org/10.1101/043372. 14 bioRxiv 043372; doi: http://dx.doi.org/10.1101/043372.

2010. On the rocks: the 15 Cary, S.C., McDonald, I.R., Barrett, J.E. and Cowan, D.A. 2010. On the rocks: the

:129-138. 16 microbiology of Antarctic Dry Valley soils. Nature Reviews Microbiology 8:129-138.

2013. 17 Chan, Y., Van Nostrand, J.D., Zhou, J., Pointing, S.B. and Farrell, R.L. 2013.

Functional ecology of an Antarctic dry valley. Proceedings of the National Academy of 18 Functional ecology of an Antarctic dry valley. Proceedings of the National Academy of

:8990-8995. 19 Sciences 110:8990-8995.

Chattopadhyay, M. 2000. Cold-adaptation of Antarctic microorganisms–possible involvement of viable but nonculturable state. Polar Biology 23:223-224.

Chattopadhyay, M. and Jagannadham, M. 2001. Maintenance of membrane fluidity in Antarctic bacteria. Polar Biology 24:386-388.

Chong, C.W., Annie Tan, G.Y., Wong, R.C.S., Riddle, M.J. and Tan, I.K.P. 2009. DGGE fingerprinting of bacteria in soils from eight ecologically different sites around Casey Station, Antarctica. Polar Biology 32:853-860.

Chong, C.W., Convey, P., Pearce, D.A. and Tan, I.K.P. 2011. Assessment of soil bacterial communities on Alexander Island (in the maritime and continental Antarctic transitional zone). Polar Biology 35:387-399.

Chong, C.W., Pearce, D.A., Convey, P., Tan, G., Wong, R. and Tan, I.K.P. 2010. High levels of spatial heterogeneity in the biodiversity of soil prokaryotes on Signy Island, Antarctica. Soil Biology and Biochemistry 42:601-610.

Chu, H., Fierer, N., Lauber, C.L., Caporaso, J.G., Knight, R. and Grogan, P. 2010. Soil bacterial diversity in the Arctic is not fundamentally different from that found in other biomes. Environmental Microbiology 12:2998-3006.

Chun, J., Lee, J.-H., Jung, Y., Kim, M., Kim, S., Kim, B.K. and Lim, Y.-W. 2007. EzTaxon: a web-based tool for the identification of prokaryotes based on 16S ribosomal RNA gene sequences. International Journal of Systematic and Evolutionary Microbiology 57:2259-2261.

Clarke, K.R. 1993. Non-parametric multivariate analyses of changes in community structure. Australian Journal of Ecology 18:117-143.

Clarke, K.R., Somerfield, P.J. and Chapman, M.G. 2006. On resemblance measures for ecological studies, including taxonomic dissimilarities and a zero-adjusted Bray–Curtis coefficient for denuded assemblages. Journal of Experimental Marine Biology and Ecology 330:55-80.

Cole, J.R., Wang, Q., Fish, J.A., Chai, B., McGarrell, D.M., Sun, Y., Brown, C.T., Porras-Alfaro, A., Kuske, C.R. and Tiedje, J.M. 2013. Ribosomal Database Project: data and tools for high throughput rRNA analysis. Nucleic Acids Research gkt1244.

Conrad, R. 1996. Soil microorganisms as controllers of atmospheric trace gases (H2, CO, CH4, OCS, N2O, and NO). Microbiological Reviews 60:609-640.

Constant, P., Chowdhury, S.P., Hesse, L., Pratscher, J. and Conrad, R. 2011. Genome data mining and soil survey for the novel group 5 [NiFe]-hydrogenase to explore the diversity and ecological importance of presumptive high-affinity H2-oxidizing bacteria. Applied and Environmental Microbiology 77:6027-6035.

Constant, P., Chowdhury, S.P., Pratscher, J. and Conrad, R. 2010. Streptomycetes contributing to atmospheric molecular hydrogen soil uptake are widespread and encode a putative high-affinity [NiFe]-hydrogenase. Environmental Microbiology 12:821-829. DOI: 10.1111/j.1462-2920.2009.02130.x

Conrad, R., Weber, M. and Seiler, W. 1983. Kinetics and electron transport of soil hydrogenases catalyzing the oxidation of atmospheric hydrogen. Soil Biology and Biochemistry 15:167-173.

Convey, P., Chown, S.L., Clarke, A., Barnes, D.K., Bokhorst, S., Cummings, V., Ducklow, H.W., Frati, F., Green, T.A. and Gordon, S. 2014. The spatial structure of Antarctic biodiversity. Ecological Monographs 84:203-244.

Costas, G., Amaya, M., Liu, Z., Tomsho, L.P., Schuster, S.C., Ward, D.M. and Bryant, D.A. 2012. Complete genome of Candidatus Chloracidobacterium thermophilum, a chlorophyll-based photoheterotroph belonging to the phylum Acidobacteria. Environmental Microbiology 14:177-190.

Costello, E.K., Halloy, S.R.P., Reed, S.C., Sowell, P. and Schmidt, S.K. 2009. Fumarole-supported islands of biodiversity within a hyperarid, high-elevation landscape on Socompa Volcano, Puna de Atacama, Andes. Applied and Environmental Microbiology 75:735-747.

Costello, E.K. 2007. Molecular phylogenetic characterisation of high altitude soil microbial communities and novel, uncultivated bacterial lineages. Doctor of Philosophy, University of Colorado.

Cowan, D.A., Russell, N.J., Mamais, A. and Sheppard, D.M. 2002. Antarctic Dry Valley mineral soils contain unexpectedly high levels of microbial biomass. Extremophiles 6:431-436.

Cowan, D., Sohm, J., Makhalanyane, T., Capone, D., Green, T., Cary, S. and Tuffin, I. 2011. Hypolithic communities: important nitrogen sources in Antarctic desert soils. Environmental Microbiology Reports 3:581-586.

Crane, L.S. 2016. Microfluidic qPCR for Microbial Ecotoxicology in Soil: A Pilot Study (Master of Philosophy), University of New South Wales.

Csardi, G. and Nepusz, T. 2006. The igraph software package for complex network research. InterJournal, Complex Systems 1695:1-9.

Darling, A.E., Jospin, G., Lowe, E., Matsen, I.V., Bik, H.M. and Eisen, J.A. 2014. PhyloSift: phylogenetic analysis of genomes and metagenomes. PeerJ 2:1-28.

De Maayer, P., Anderson, D., Cary, C. and Cowan, D.A. 2014. Some like it cold: understanding the survival strategies of psychrophiles. EMBO Reports e201338170.

Delille, D. 2000. Response of Antarctic soil bacterial assemblages to contamination by diesel fuel and crude oil. Microbial Ecology 40:159-168.

DeSantis, T., Hugenholtz, P., Keller, K., Brodie, E., Larsen, N., Piceno, Y., Phan, R. and Andersen, G.L. 2006. NAST: a multiple sequence alignment server for comparative analysis of 16S rRNA genes. Nucleic Acids Research 34:W394-W399.

DeSantis, T.Z., Hugenholtz, P., Larsen, N., Rojas, M., Brodie, E.L., Keller, K., Huber, T., Dalevi, D., Hu, P. and Andersen, G.L. 2006. Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Applied and Environmental Microbiology 72:5069-5072.

Dewhirst, F.E., Klein, E.A., Thompson, E.C., Blanton, J.M., Chen, T., Milella, L., Buckley, C.M., Davis, I.J., Bennett, M.L. and Marshall-Jones, Z.V. 2012. The canine oral microbiome. PLoS One 7:e36067.

Dieser, M., Greenwood, M. and Foreman, C.M. 2010. Carotenoid pigmentation in Antarctic heterotrophic bacteria as a strategy to withstand environmental stresses. Arctic, Antarctic, and Alpine Research 42:396-405.

Doyle, S.M., Montross, S.N., Skidmore, M.L. and Christner, B.C. 2013. Characterizing microbial diversity and the potential for metabolic function at -15 °C in the Basal Ice of Taylor Glacier, Antarctica. Biology 2:1034-1053.

Duckworth, M., Archibald, A.R. and Baddiley, J. 1972. The location of N-acetylgalactosamine in the walls of Bacillus subtilis 168. Biochemical Journal 130:691-696.

Durso, L.M., Miller, D.N. and Wienhold, B.J. 2012. Distribution and quantification of antibiotic resistant genes and bacteria across agricultural and non-agricultural metagenomes. PLoS One 7:e48325.

Edgar, R.C. 2010. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26:2460-2461.

Edgar, R.C., Haas, B.J., Clemente, J.C., Quince, C. and Knight, R. 2011. UCHIME improves sensitivity and speed of chimera detection. Bioinformatics 27:2194-2200.

Eickhorst, T. and Tippkötter, R. 2008. Improved detection of soil microorganisms using fluorescence in situ hybridization (FISH) and catalyzed reporter deposition (CARD-FISH). Soil Biology and Biochemistry 40:1883-1891.

Erdmann, N. 1983. Organic osmoregulatory solutes in blue-green algae. Zeitschrift für Pflanzenphysiologie 110:147-155.

Espinosa-Garcia, F. and Langenheim, J. 1990. The endophytic fungal community in leaves of a coastal redwood population diversity and spatial patterns. New Phytologist 116:89-97.

Ferrari, B.C., Binnerup, S.J. and Gillings, M. 2005. Microcolony cultivation on a soil substrate membrane system selects for previously uncultured soil bacteria. Applied and Environmental Microbiology 71:8714-8720.

Ferrari, B.C., Tujula, N., Stoner, K. and Kjelleberg, S. 2006. Catalyzed reporter deposition-fluorescence in situ hybridization allows for enrichment-independent detection of microcolony-forming soil bacteria. Applied and Environmental Microbiology 72:918-922.

Ferrari, B.C., Bissett, A., Snape, I., Dorst, J., Palmer, A.S., Ji, M., Siciliano, S.D., Stark, J.S., Winsley, T. and Brown, M.V. 2015. Geological connectivity drives microbial community structure and connectivity in polar, terrestrial ecosystems. Environmental Microbiology 18:1834-1849.

Fierer, N., Jackson, J.A., Vilgalys, R. and Jackson, R.B. 2005. Assessment of soil microbial community structure by use of taxon-specific quantitative PCR assays. Applied and Environmental Microbiology 71:4117-4120.

Fierer, N. and Jackson, R.B. 2006. The diversity and biogeography of soil bacterial communities. Proceedings of the National Academy of Sciences of the United States of America 103:626-631.

Foreman, C.M., Sattler, B., Mikucki, J.A., Porazinska, D.L. and Priscu, J.C. 2007. Metabolic activity and diversity of cryoconites in the Taylor Valley, Antarctica. Journal of Geophysical Research: Biogeosciences 112:G04S32.

Fowler, D., Coyle, M., Skiba, U., Sutton, M.A., Cape, J.N., Reis, S., Sheppard, L.J., Jenkins, A., Grizzetti, B. and Galloway, J.N. 2013. The global nitrogen cycle in the twenty-first century. Philosophical Transactions of the Royal Society of London B: Biological Sciences 368:20130164.

Fox, A.J., Paul, A. and Cooper, R. 1994. Measured properties of the Antarctic ice sheet derived from the SCAR Antarctic digital database. Polar Record 30:201-206.

Friedmann, E.I. 1982. Endolithic microorganisms in the Antarctic cold desert. Science 215:1045-1053.

Friedrich, M.W. 2006. Stable-isotope probing of DNA: insights into the function of uncultivated microorganisms from isotopically labeled metagenomes. Current Opinion in Biotechnology 17:59-66.

Fuchs, B.M., Wallner, G., Beisker, W., Schwippl, I., Ludwig, W. and Amann, R. 1998. Flow cytometric analysis of the in situ accessibility of Escherichia coli 16S rRNA for fluorescently labeled oligonucleotide probes. Applied and Environmental Microbiology 64:4973-4982.

Fuchs, B.M., Syutsubo, K., Ludwig, W. and Amann, R. 2001. In situ accessibility of Escherichia coli 23S rRNA to fluorescently labeled oligonucleotide probes. Applied and Environmental Microbiology 67:961-968.

Galun, M., Paran, N. and Ben-Shaul, Y. 1970. The fungus-alga association in the Lecanoraceae: an ultrastructural study. New Phytologist 69:599-603.

Gardes, M. and Bruns, T.D. 1993. ITS primers with enhanced specificity for basidiomycetes-application to the identification of mycorrhizae and rusts. Molecular Ecology 2:113-118.

Gershenzon, J. and Dudareva, N. 2007. The function of terpene natural products in the natural world. Nature Chemical Biology 3:408-414.

Gesheva, V. 2010. Production of antibiotics and enzymes by soil microorganisms from the Windmill Islands region, Wilkes Land, East Antarctica. Polar Biology 33:1351-1357.

Giri, B.J., Bano, N. and Hollibaugh, J.T. 2004. Distribution of RuBisCO genotypes along a redox gradient in Mono Lake, California. Applied and Environmental Microbiology 70:3443-3448.

Glueckauf, E. and Kitt, G. 1957. The hydrogen content of atmospheric air at ground level. Quarterly Journal of the Royal Meteorological Society 83:522-528.

Goodfellow, M. et al. 2001. Bergey's Manual of Systematic Bacteriology, vol. 1, 2nd edn. Springer, New York.

Goodwin, I.D. 1993. Holocene deglaciation, sea-level change, and the emergence of the Windmill Islands, Budd Coast, Antarctica. Quaternary Research 40:55-69.

Grandjean, J. and Goody, R. 1955. The Concentration of Carbon Dioxide in the Atmosphere of Mars. The Astrophysical Journal 121:548.

Grasby, S.E., Richards, B.C., Sharp, C.E., Brady, A.L., Jones, G.M., Dunfield, P.F. and Williamson, M.C. 2013. The Paint Pots, Kootenay National Park, Canada - a natural acid spring analogue for Mars. Canadian Journal of Earth Sciences 50:94-108.

Greening, C., Berney, M., Hards, K., Cook, G.M. and Conrad, R. 2014a. A soil actinobacterium scavenges atmospheric H2 using two membrane-associated, oxygen-dependent [NiFe] hydrogenases. Proceedings of the National Academy of Sciences 111:4257-4261.

Greening, C., Villas-Bôas, S.G., Robson, J.R., Berney, M. and Cook, G.M. 2014b. The growth and survival of Mycobacterium smegmatis is enhanced by co-metabolism of atmospheric H2. PLoS One 9:e103034.

Greening, C., Biswas, A., Carere, C.R., Jackson, C.J., Taylor, M.C., Stott, M.B., Cook, G.M. and Morales, S.E. 2015a. Genomic and metagenomic surveys of hydrogenase distribution indicate H2 is a widely utilised energy source for microbial growth and survival. The ISME Journal 10:761-777.

Greening, C., Constant, P., Hards, K., Morales, S.E., Oakeshott, J.G., Russell, R.J., Taylor, M.C., Berney, M., Conrad, R. and Cook, G.M. 2015b. Atmospheric hydrogen scavenging: from enzymes to ecosystems. Applied and Environmental Microbiology 81:1190-1199.

Gregorich, E., Hopkins, D., Elberling, B., Sparrow, A., Novis, P., Greenfield, L. and Rochette, P. 2006. Emission of CO2, CH4 and N2O from lakeshore soils in an Antarctic dry valley. Soil Biology and Biochemistry 38:3120-3129.

Grostern, A. and Alvarez-Cohen, L. 2013. RubisCO-based CO2 fixation and C1 metabolism in the actinobacterium Pseudonocardia dioxanivorans CB1190. Environmental Microbiology 15:3040-3053.

Grzesiak, J., Zmuda-Baranowska, M., Borsuk, P. and Zdanowski, M. 2009. Microbial community at the front of Ecology Glacier (King George Island, Antarctica): initial observations. Polish Polar Research 30:37-47.

Guttman, L. and van Rijn, J. 2008. Identification of conditions underlying production of geosmin and 2-methylisoborneol in a recirculating system. Aquaculture 279:85-91.

Harrell Jr, F.E. 2008. Hmisc: Harrell miscellaneous. R package version 3.4-4.

Harvey, A.N., Snape, I. and Siciliano, S.D. 2012. Validating potential toxicity assays to assess petroleum hydrocarbon toxicity in polar soil. Environmental Toxicology and Chemistry 31:402-407.

He, H., Shen, B., Petersen, P.J., Weiss, W.J., Yang, H.Y., Wang, T.-Z., Dushin, R.G., Koehn, F.E. and Carter, G.T. 2004. Mannopeptimycin esters and carbonates, potent antibiotic agents against drug-resistant bacteria. Bioorganic & Medicinal Chemistry Letters 14:279-282.

Heal, O. and Block, W. 1987. Soil biological processes in the North and South. Ecological Bulletins 47-57.

Hoover, R.B. 2006. Comets, asteroids, meteorites, and the origin of the biosphere. Proceedings of SPIE 6309:63090J.

Howard-Williams, C., Priscu, J.C. and Vincent, W.F. 1989. Nitrogen dynamics in two Antarctic streams. High Latitude Limnology, Springer: 51-61.

Howe, A.C., Jansson, J.K., Malfatti, S.A., Tringe, S.G., Tiedje, J.M. and Brown, C.T. 2014. Tackling soil diversity with the assembly of large, complex metagenomes. Proceedings of the National Academy of Sciences 111:4904-4909.

Hu, Y.M., Butcher, P.D., Sole, K., Mitchison, D. and Coates, A. 1998. Protein synthesis is shutdown in dormant Mycobacterium tuberculosis and is reversed by oxygen or heat shock. FEMS Microbiology Letters 158:139-145.

Hua, Z.-S., Han, Y.-J., Chen, L.-X., Liu, J., Hu, M., Li, S.-J., Kuang, J.-L., Chain, P.S., Huang, L.-N. and Shu, W.-S. 2014. Ecological roles of dominant and rare prokaryotes in acid mine drainage revealed by metagenomics and metatranscriptomics. The ISME Journal 9:1280-1294.

Hug, L.A., Baker, B.J., Anantharaman, K., Brown, C.T., Probst, A.J., Castelle, C.J., Butterfield, C.N., Hernsdorf, A.W., Amano, Y. and Ise, K. 2016. A new view of the tree of life. Nature Microbiology 1:16048.

Hugenholtz, P., Goebel, B.M. and Pace, N.R. 1998. Impact of culture-independent studies on the emerging phylogenetic view of bacterial diversity. Journal of Bacteriology 180:4765-4774.

Hugenholtz, P., Tyson, G.W., Webb, R.I., Wagner, A.M. and Blackall, L.L. 2001. Investigation of candidate division TM7, a recently recognized major lineage of the domain Bacteria with no known pure-culture representatives. Applied and Environmental Microbiology 67:411-419.

Hughes, K., Convey, P., Maslen, N. and Smith, R. 2010. Accidental transfer of non-native soil organisms into Antarctica on construction vehicles. Biological Invasions 12:875-891.

Huson, D.H., Richter, D.C., Rausch, C., Dezulian, T., Franz, M. and Rupp, R. 2007. Dendroscope: An interactive viewer for large phylogenetic trees. BMC Bioinformatics 8:1.

Imelfort, M., Parks, D.H., Woodcroft, B.J., Dennis, P.D., Hugenholtz, P. and Tyson, G.W. 2014. GroopM: an automated tool for the recovery of population genomes from related metagenomes. PeerJ 2:e603.

Iverson, V., Morris, R.M., Frazar, C.D., Berthiaume, C.T., Morales, R.L. and Armbrust, E.V. 2012. Untangling genomes from metagenomes: revealing an uncultured class of marine Euryarchaeota. Science 335:587-590.

Ji, M., van Dorst, J., Bissett, A., Brown, M.V., Palmer, A.S., Snape, I., Siciliano, S.D. and Ferrari, B.C. 2015. Microbial diversity at Mitchell Peninsula, Eastern Antarctica: a potential biodiversity “hotspot”. Polar Biology 39:237-249.

John, D.E., Wang, Z.A., Liu, X., Byrne, R.H., Corredor, J.E., López, J.M., Cabrera, A., Bronk, D.A., Tabita, F.R. and Paul, J.H. 2007. Phytoplankton carbon fixation gene (RuBisCO) transcripts and air-sea CO2 flux in the Mississippi River plume. The ISME Journal 1:517-531.

Johnson, J.W., Fisher, J.F. and Mobashery, S. 2013. Bacterial cell-wall recycling. Annals of the New York Academy of Sciences 1277:54-75.

Joshi, G.S., Romagnoli, S., VerBerkmoes, N.C., Hettich, R.L., Pelletier, D. and Tabita, F.R. 2009. Differential accumulation of form I RubisCO in Rhodopseudomonas palustris CGA010 under photoheterotrophic growth conditions with reduced carbon sources. Journal of Bacteriology 191:4243-4250.

Jung, J., Yeom, J., Kim, J., Han, J., Lim, H.S., Park, H., Hyun, S. and Park, W. 2011. Change in gene abundance in the nitrogen biogeochemical cycle with temperature and nitrogen addition in Antarctic soils. Research in Microbiology 162:1018-1026.

Kanai, R., Miyachi, S. and Takamiya, A. 1960. Knall-gas reaction-linked fixation of labelled carbon dioxide in an autotrophic Streptomyces. Nature 188:873-875.

Kanehisa, M. and Goto, S. 2000. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Research 28:27-30.

Kang, D.D., Froula, J., Egan, R. and Wang, Z. 2015. MetaBAT, an efficient tool for accurately reconstructing single genomes from complex microbial communities. PeerJ 3:1-15.

Kantor, R.S., Wrighton, K.C., Handley, K.M., Sharon, I., Hug, L.A., Castelle, C.J., Thomas, B.C. and Banfield, J.F. 2013. Small genomes and sparse metabolisms of sediment-associated bacteria from four candidate phyla. mBio 4:e00708-00713.

Katoh, K. and Toh, H. 2008. Recent developments in the MAFFT multiple sequence alignment program. Briefings in Bioinformatics 9:286-298.

Kavamura, V.N., Taketani, R.G., Lançoni, M.D., Andreote, F.D., Mendes, R. and de Melo, I.S. 2013. Water regime influences bulk soil and rhizosphere of Cereus jamacaru bacterial communities in the Brazilian Caatinga biome. PLoS One 8:e73606.

Kelly, L.C., Cockell, C.S., Piceno, Y.M., Andersen, G.L., Thorsteinsson, T. and Marteinsson, V. 2010. Bacterial Diversity of Weathered Terrestrial Icelandic Volcanic Glasses. Microbial Ecology 60:740-752.

Kim, M., Morrison, M. and Yu, Z. 2011. Evaluation of different partial 16S rRNA gene sequence regions for phylogenetic analysis of microbiomes. Journal of Microbiological Methods 84:81-87.

Kim, O.S., Chae, N., Lim, H.S., Cho, A., Kim, J.H., Hong, S.G. and Oh, J. 2012. Bacterial diversity in ornithogenic soils compared to mineral soils on King George Island, Antarctica. Journal of Microbiology 50:1081-1085.

Klähn, S. and Hagemann, M. 2011. Compatible solute biosynthesis in Cyanobacteria. Environmental Microbiology 13:551-562.

Klausen, C., Nicolaisen, M.H., Strobel, B.W., Warnecke, F., Nielsen, J.L. and Jørgensen, N.O. 2005. Abundance of actinobacteria and production of geosmin and 2-methylisoborneol in Danish streams and fish ponds. FEMS Microbiology Ecology 52:265-278.

Knowles, E.J. and Castenholz, R.W. 2008. Effect of exogenous extracellular polysaccharides on the desiccation and freezing tolerance of rock-inhabiting phototrophic microorganisms. FEMS Microbiology Ecology 66:261-270.

Kõljalg, U., Larsson, K.H., Abarenkov, K., Nilsson, R.H., Alexander, I.J., Eberhardt, U., Erland, S., Høiland, K., Kjøller, R., Larsson, E. and Pennanen, T. 2005. UNITE: a database providing web-based methods for the molecular identification of ectomycorrhizal fungi. New Phytologist 166:1063-1068.

Kong, W., Dolhi, J.M., Chiuchiolo, A., Priscu, J. and Morgan-Kiss, R.M. 2012. Evidence of form II RubisCO (cbbM) in a perennially ice-covered Antarctic lake. FEMS Microbiology Ecology 82:491-500.

Krasnopolsky, V.A. and Feldman, P.D. 2001. Detection of molecular hydrogen in the atmosphere of Mars. Science 294:1914-1917.

Kudinova, A., Lysak, L., Lapygina, E., Soina, V. and Mergelov, N. 2015. Diversity and viability of prokaryotes in primitive soils of the Larsemann oasis (East Antarctica). Biology Bulletin 42:92-97.

Kudinova, A., Lysak, L., Soina, V., Mergelov, N., Dolgikh, A. and Shorkunov, I. 2015. Bacterial communities in the soils of cryptogamic barrens of East Antarctica (the Larsemann Hills and Thala Hills oases). Eurasian Soil Science 48:276-287.

Kuhns, L.G., Benoit, S.L., Bayyareddy, K., Johnson, D., Orlando, R., Evans, A.L., Waldrop, G.L. and Maier, R.J. 2016. Carbon Fixation Driven by Molecular Hydrogen Results in Chemolithoautotrophically Enhanced Growth of Helicobacter pylori. Journal of Bacteriology 198:1423-1428.

Kumar, S., Nei, M., Dudley, J. and Tamura, K. 2008. MEGA: a biologist-centric software for evolutionary analysis of DNA and protein sequences. Briefings in Bioinformatics 9:299-306.

Lacap, D.C., Warren-Rhodes, K.A., McKay, C.P. and Pointing, S.B. 2011. Cyanobacteria and chloroflexi-dominated hypolithic colonization of quartz at the hyper-arid core of the Atacama Desert, Chile. Extremophiles 15:31-38.

Lane, D. 1991. 16S/23S rRNA sequencing. Nucleic Acid Techniques in Bacterial Systematics 125-175.

Larkin, M.A., Blackshields, G., Brown, N.P., Chenna, R., McGettigan, P.A., McWilliam, H., Valentin, F., Wallace, I.M., Wilm, A., Lopez, R., Thompson, J.D., Gibson, T.J. and Higgins, D.G. 2007. Clustal W and Clustal X version 2.0. Bioinformatics 23:2947-2948.

Lasken, R.S. 2007. Single-cell genomic sequencing using multiple displacement amplification. Current Opinion in Microbiology 10:510-516.

Lauber, C.L., Hamady, M., Knight, R. and Fierer, N. 2009. Pyrosequencing-based assessment of soil pH as a predictor of soil bacterial community structure at the continental scale. Applied and Environmental Microbiology 75:5111-5120.

Lauro, F.M., DeMaere, M.Z., Yau, S., Brown, M.V., Ng, C., Wilkins, D., Raftery, M.J., Gibson, J.A., Andrews-Pfannkoch, C., Lewis, M. and Hoffman, J.M. 2011. An integrative study of a meromictic lake ecosystem in Antarctica. The ISME Journal 5:879-895.

Lee, C.K., Barbier, B.A., Bottos, E.M., McDonald, I.R. and Cary, S.C. 2012. The inter-valley soil comparative survey: the ecology of Dry Valley edaphic microbial communities. The ISME Journal 6:1046-1057.

Lee, N., Nielsen, P.H., Andreasen, K.H., Juretschko, S., Nielsen, J.L., Schleifer, K.-H. and Wagner, M. 1999. Combination of fluorescent in situ hybridization and microautoradiography—a new tool for structure-function analyses in microbial ecology. Applied and Environmental Microbiology 65:1289-1297.

Leishman, M.R. and Wild, C. 2001. Vegetation abundance and diversity in relation to soil nutrients and soil water content in Vestfold Hills, East Antarctica. Antarctic Science 13:126-134.

Lesaulnier, C., Papamichail, D., McCorkle, S., Ollivier, B., Skiena, S., Taghavi, S., Zak, D. and Van Der Lelie, D. 2008. Elevated atmospheric CO2 affects soil microbial diversity associated with trembling aspen. Environmental Microbiology 10:926-941.

Letunic, I. and Bork, P. 2011. Interactive tree of life v2: online annotation and display of phylogenetic trees made easy. Nucleic Acids Research 39:475-478.

Levin, G.V. and Straat, P.A. 2009. Methane and life on Mars. SPIE Optical Engineering Applications, International Society for Optics and Photonics 74410D-74410D.

Ling, H. and Seppelt, R. 1990. Snow algae of the Windmill Islands, continental Antarctica Mesotaenium berggrenii (Zygnematales, Chlorophyta) the alga of grey snow. Antarctic Science 2:143-148.

Liot, Q. and Constant, P. 2015. Breathing air to save energy–new insights into the ecophysiological role of high-affinity [NiFe]-hydrogenase in Streptomyces avermitilis. MicrobiologyOpen 5:47-59.

Logan, N., Lebbe, L., Hoste, B., Goris, J., Forsyth, G., Heyndrickx, M., Murray, B., Syme, N., Wynn-Williams, D. and De Vos, P. 2000. Aerobic endospore-forming bacteria from geothermal environments in northern Victoria Land, Antarctica, and Candlemas Island, South Sandwich archipelago, with the proposal of Bacillus fumarioli sp. nov. International Journal of Systematic and Evolutionary Microbiology 50:1741-1753.

Lombard, V., Golaconda Ramulu, H., Drula, E., Coutinho, P.M. and Henrissat, B. 2014. The carbohydrate-active enzymes database (CAZy) in 2013. Nucleic Acids Research 42:490-495.

Ludwig, W., Strunk, O., Westram, R., Richter, L., Meier, H., Buchner, A., Lai, T., Steppi, S., Jobb, G. and Förster, W. 2004. ARB: a software environment for sequence data. Nucleic Acids Research 32:1363-1371.

Luo, J., Tillman, R. and Ball, P. 2000. Nitrogen loss through denitrification in a soil under pasture in New Zealand. Soil Biology and Biochemistry 32:497-509.

Lynch, R.C., Darcy, J.L., Kane, N.C., Nemergut, D.R. and Schmidt, S.K. 2014. Metagenomic evidence for metabolism of trace atmospheric gases by high-elevation desert Actinobacteria. Frontiers in Microbiology 5:698.

Lyon, G.L. and Giggenbach, W.F. 1974. Geothermal activity in Victoria Land, Antarctica. New Zealand Journal of Geology and Geophysics 17:511-521.

Ma, D., Zhu, R., Ding, W., Shen, C., Chu, H. and Lin, X. 2013. Ex-situ enzyme activity and bacterial community diversity through soil depth profiles in penguin and seal colonies on Vestfold Hills, East Antarctica. Polar Biology 36:1347-1361.

Magalhães, C.M., Machado, A., Frank-Fahle, B., Lee, C.K. and Cary, S.C. 2014. The ecological dichotomy of ammonia-oxidizing archaea and bacteria in the hyper-arid soils of the Antarctic Dry Valleys. Frontiers in Microbiology 5:515.

Makhalanyane, T.P., Valverde, A., Birkeland, N.K., Cary, S.C., Tuffin, I.M. and Cowan, D.A. 2013. Evidence for successional development in Antarctic hypolithic bacterial communities. The ISME Journal 7:2080-2090.

Makhalanyane, T.P., Valverde, A., Velázquez, D., Gunnigle, E., Van Goethem, M.W., Quesada, A. and Cowan, D.A. 2015. Ecology and biogeochemistry of cyanobacteria in soils, permafrost, aquatic and cryptic polar habitats. Biodiversity and Conservation 24:819-840.

Marcy, Y., Ouverney, C., Bik, E.M., Lösekann, T., Ivanova, N., Martin, H.G., Szeto, E., Platt, D., Hugenholtz, P. and Relman, D.A. 2007. Dissecting biological “dark matter” with single-cell genetic analysis of rare and uncultivated TM7 microbes from the human mouth. Proceedings of the National Academy of Sciences 104:11889-11894.

Maresca, J.A., Romberger, S.P. and Bryant, D.A. 2008. Isorenieratene biosynthesis in green sulfur bacteria requires the cooperative actions of two carotenoid cyclases. Journal of Bacteriology 190:6384-6391.

Markowitz, V.M., Chen, I.-M.A., Palaniappan, K., Chu, K., Szeto, E., Grechkin, Y., Ratner, A., Jacob, B., Huang, J., Williams, P., Huntemann, M., Anderson, I., Mavromatis, K., Ivanova, N.N. and Kyrpides, N.C. 2012. IMG: the integrated microbial genomes database and comparative analysis system. Nucleic Acids Research 40:115-122.

Martiny, J.B.H., Bohannan, B.J., Brown, J.H., Colwell, R.K., Fuhrman, J.A., Green, J.L., Horner-Devine, M.C., Kane, M., Krumins, J.A. and Kuske, C.R. 2006. Microbial biogeography: putting microorganisms on the map. Nature Reviews Microbiology 4:102-112.

Mataloni, G., Garraza, G.G., Bölter, M., Convey, P. and Fermani, P. 2010. What shapes edaphic communities in mineral and ornithogenic soils of Cierva Point, Antarctic Peninsula. Polar Science 4:405-419.

McDonald, D., Price, M.N., Goodrich, J., Nawrocki, E.P., DeSantis, T.Z., Probst, A., Andersen, G.L., Knight, R. and Hugenholtz, P. 2012. An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea. The ISME Journal 6:610-618.

McRae, C.F., Hocking, A.D. and Seppelt, R.D. 1999a. Penicillium species from terrestrial habitats in the Windmill Islands, East Antarctica, including a new species, Penicillium antarcticum. Polar Biology 21:97-111.

McRae, C.F. and Seppelt, R. 1999b. Filamentous fungi of the Windmill Islands, continental Antarctica. Effect of water content in moss turves on fungal diversity. Polar Biology 22:389-394.

McMurdie, P. and Holmes, S. 2014. Waste not, want not: why rarefying microbiome data is inadmissible. PLoS Comput Biol 10(4): e1003531. doi:10.1371/journal.pcbi.1003531.

Melick, D., Hovenden, M. and Seppelt, R. 1994. Phytogeography of bryophyte and lichen vegetation in the Windmill Islands, Wilkes Land, Continental Antarctica. Vegetatio 111:71-87.

Miadlikowska, J., Kauff, F., Hofstetter, V., Fraker, E., Grube, M., Hafellner, J., Reeb, V., Hodkinson, B.P., Kukwa, M., Lücking, R. and Hestmark, G. 2006. New insights into classification and evolution of the Lecanoromycetes (Pezizomycotina, Ascomycota) from phylogenetic analyses of three ribosomal RNA- and two protein-coding genes. Mycologia 98:1088-1103.

Mikucki, J.A. and Priscu, J.C. 2007. Bacterial diversity associated with Blood Falls, a subglacial outflow from the Taylor Glacier, Antarctica. Applied and Environmental Microbiology 73:4029-4039.

Miller, R.V., Gammon, K. and Day, M.J. 2009. Antibiotic resistance among bacteria isolated from seawater and penguin fecal samples collected near Palmer Station, Antarctica. Canadian Journal of Microbiology 55:37-45.

Mills, H., Reese, B., Shepard, Al., Riedinger, N., Dowd, S., Morono, Y., Inagaki, F. 2012. Characterization of metabolically active bacterial populations in subseafloor Nankai Trough sediments above, within, and below the sulfate–methane transition zone. Frontiers in Microbiology 3:113.

Mori, T., Takahashi, K., Kashiwabara, M., Uemura, D., Katayama, C., Iwadare, S., Shizuri, Y., Mitomo, R., Nakano, F. and Matsuzaki, A. 1985. Structure of oxazolomycin, a novel β-lactone antibiotic. Tetrahedron Letters 26:1073-1076.

Moriya, Y., Itoh, M., Okuda, S., Yoshizawa, A.C. and Kanehisa, M. 2007. KAAS: an automatic genome annotation and pathway reconstruction server. Nucleic Acids Research 35:W182-W185.

Mothes, G., Schubert, T., Harms, H. and Maskow, T. 2008. Biotechnological coproduction of compatible solutes and polyhydroxyalkanoates using the genus Halomonas. Engineering in Life Sciences 8:658-662.

Mueller-Cajar, O., Stotz, M., Wendler, P., Hartl, F.U., Bracher, A. and Hayer-Hartl, M. 2011. Structure and function of the AAA+ protein CbbX, a red-type Rubisco activase. Nature 479:194-199.

Muggia, L., Klug, B., Berg, G. and Grube, M. 2013. Localization of bacteria in lichens from Alpine soil crusts by fluorescence in situ hybridization. Applied Soil Ecology 68:20-25.

Muñoz, P.A., Flores, P.A., Boehmwald, F.A. and Blamey, J.M. 2011. Thermophilic bacteria present in a sample from Fumarole Bay, Deception Island. Antarctic Science 23:549-555.

Nakae, S., Ito, S., Higa, M., Senoura, T., Wasaki, J., Hijikata, A., Shionyu, M., Ito, S. and Shirai, T. 2013. Structure of novel enzyme in mannan biodegradation process 4-O-β-d-mannosyl-d-glucose phosphorylase MGP. Journal of Molecular Biology 425:4468-4478.

Niederberger, T.D., Sohm, J.A., Gunderson, T.E., Parker, A.E., Tirindelli, J., Capone, D.G., Carpenter, E.J. and Cary, S.C. 2015. Microbial community composition of transiently wetted Antarctic Dry Valley soils. Frontiers in Microbiology 6. doi:10.3389/fmicb.2015.00009.

Nessner Kavamura, V., Taketani, R.G., Lanconi, M.D., Andreote, F.D., Mendes, R. and Soares de Melo, I. 2013. Water regime influences bulk soil and rhizosphere of Cereus jamacaru bacterial communities in the Brazilian Caatinga biome. PLoS One 8:e73606. doi:10.1371/journal.pone.0073606.

Newton, I.L. and Roeselers, G. 2012. The effect of training set on the classification of honey bee gut using the Naive Bayesian Classifier. BMC Microbiology 12:1.

Nguyen, N.-P., Warnow, T., Pop, M. and White, B. 2016. A perspective on 16S rRNA operational taxonomic unit clustering using sequence similarity. npj Biofilms and Microbiomes 2:16004.

Nichols, D., Bowman, J., Sanderson, K., Nichols, C.M., Lewis, T., McMeekin, T. and Nichols, P.D. 1999. Developments with Antarctic microorganisms: culture collections, bioactivity screening, taxonomy, PUFA production and cold-adapted enzymes. Current Opinion in Biotechnology 10:240-246.

Niederberger, T.D., McDonald, I.R., Hacker, A.L., Soo, R.M., Barrett, J.E., Wall, D.H. and Cary, S.C. 2008. Microbial community composition in soils of Northern Victoria Land, Antarctica. Environmental Microbiology 10:1713-1724.

Nogales, B., Moore, E.R., Llobet-Brossa, E., Rossello-Mora, R., Amann, R. and Timmis, K.N. 2001. Combined use of 16S ribosomal DNA and 16S rRNA to study the bacterial community of polychlorinated biphenyl-polluted soil. Applied and Environmental Microbiology 67:1874-1884.

Nordberg, H., Cantor, M., Dusheyko, S., Hua, S., Poliakov, A., Shabalov, I., Smirnova, T., Grigoriev, I.V. and Dubchak, I. 2014. The genome portal of the Department of Energy Joint Genome Institute: 2014 updates. Nucleic Acids Research 42:D26-D31.

Novis, P.M., Whitehead, D., Gregorich, E.G., Hunt, J.E., Sparrow, A.D., Hopkins, D.W., Elberling, B.O. and Greenfield, L.G. 2007. Annual carbon fixation in terrestrial populations of Nostoc commune (Cyanobacteria) from an Antarctic dry valley is driven by temperature regime. Global Change Biology 13:1224-1237.

Oliver, J.D. 2005. The viable but nonculturable state in bacteria. The Journal of Microbiology 43:93-100.

Omura, S., Kitao, C., Tanaka, H., Oiwa, R., Takahashi, Y., Nakagawa, A., Shimada, M. and Iwai, Y. 1976. A new antibiotic, asukamycin, produced by Streptomyces. The Journal of Antibiotics 29:876-881.

Oppenheimer, C., Kyle, P., Tsanev, V., McGonigle, A., Mather, T. and Sweeney, D. 2005. Mt. Erebus, the largest point source of NO2 in Antarctica. Atmospheric Environment 39:6000-6006.

Ortiz, M., Legatzki, A., Neilson, J.W., Fryslie, B., Nelson, W.M., Wing, R.A., Soderlund, C.A., Pryor, B.M. and Maier, R.M. 2014. Making a living while starving in the dark: metagenomic insights into the energy dynamics of a carbonate cave. The ISME Journal 8:478-491.

Pan, Q., Wang, F., Zhang, Y., Cai, M., He, J. and Yang, H. 2013. Denaturing gradient gel electrophoresis fingerprinting of soil bacteria in the vicinity of the Chinese Great Wall Station, King George Island, Antarctica. Journal of Environmental Sciences 25:1649-1655.

Park, S.W., Hwang, E.H., Jang, H.S., Lee, J.H., Kang, B.S., Oh, J.I. and Kim, Y.M. 2009. Presence of duplicate genes encoding a phylogenetically new subgroup of form I ribulose 1,5-bisphosphate carboxylase/oxygenase in Mycobacterium sp. strain JC1 DSM 3803. Research in Microbiology 160:159-165.

Parks, D.H., Imelfort, M., Skennerton, C.T., Hugenholtz, P. and Tyson, G.W. 2015. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Research 25:1043-1055.

Pearce, D.A., Newsham, K.K., Thorne, M.A., Calvo-Bado, L., Krsek, M., Laskaris, P., Hodson, A. and Wellington, E.M. 2012. Metagenomic analysis of a southern maritime Antarctic soil. Frontiers in Microbiology 3. doi:10.3389/fmicb.2012.00403.

Pernthaler, A., Pernthaler, J. and Amann, R. 2002. Fluorescence in situ hybridization and catalyzed reporter deposition for the identification of marine bacteria. Applied and Environmental Microbiology 68:3094-3101.

Plotree, D. and Plotgram, D. 1989. PHYLIP-phylogeny inference package (version 3.2). Cladistics 5:163-166.

Pointing, S.B., Chan, Y., Lacap, D.C., Lau, M.C., Jurgens, J.A. and Farrell, R.L. 2009. Highly specialized microbial diversity in hyper-arid polar desert. Proceedings of the National Academy of Sciences 106:19964-19969.

Powell, S.M., Ferguson, S.H., Bowman, J.P. and Snape, I. 2006. Using real-time PCR to assess changes in the hydrocarbon-degrading microbial community in Antarctic soil during bioremediation. Microbial Ecology 52:523-532.

Powell, S.M., Riddle, M.J., Snape, I. and Stark, J.S. 2005. Location and DGGE methodology can influence interpretation of field experimental studies on the response to hydrocarbons by Antarctic benthic microbial community. Antarctic Science 17:353-360.

Price, M.N., Dehal, P.S. and Arkin, A.P. 2010. FastTree 2--approximately maximum-likelihood trees for large alignments. PLoS One 5:e9490.

Pruesse, E., Quast, C., Knittel, K., Fuchs, B.M., Ludwig, W., Peplies, J. and Glöckner, F.O. 2007. SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB. Nucleic Acids Research 35:7188-7196.

Pruesse, E., Peplies, J. and Glöckner, F.O. 2012. SINA: accurate high-throughput multiple sequence alignment of ribosomal RNA genes. Bioinformatics 28:1823-1829.

Purdy, K., Nedwell, D. and Embley, T. 2003. Analysis of the sulfate-reducing bacterial and methanogenic archaeal populations in contrasting Antarctic sediments. Applied and Environmental Microbiology 69:3181-3191.

Quast, C., Pruesse, E., Yilmaz, P., Gerken, J., Schweer, T., Yarza, P., Peplies, J. and Glockner, F.O. 2013. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Research 41:D590-D596.

Quere, C.L., Harrison, S.P., Colin Prentice, I., Buitenhuis, E.T., Aumont, O., Bopp, L., Claustre, H., Cotrim Da Cunha, L., Geider, R., Giraud, X. and Klaas, C. 2005. Ecosystem dynamics based on plankton functional types for global ocean biogeochemistry models. Global Change Biology 11:2016-2040.

Quince, C., Lanzen, A., Davenport, R.J. and Turnbaugh, P.J. 2011. Removing noise from pyrosequenced amplicons. BMC Bioinformatics 12:38.

Raghunathan, A., Ferguson, H.R., Bornarth, C.J., Song, W., Driscoll, M. and Lasken, R.S. 2005. Genomic DNA amplification from a single bacterium. Applied and Environmental Microbiology 71:3342-3347.

Ramette, A. and Tiedje, J.M. 2007. Multiscale responses of microbial life to spatial distance and environmental heterogeneity in a patchy ecosystem. Proceedings of the National Academy of Sciences 104:2761-2766.

Rampelotto, P.H., Barboza, A.D.M., Pereira, A.B., Triplett, E.W., Schaefer, C.E.G., de Oliveira Camargo, F.A. and Roesch, L.F.W. 2015. Distribution and interaction patterns of bacterial communities in an ornithogenic soil of Seymour Island, Antarctica. Microbial Ecology 69:684-694.

Raymond, J., Siefert, J.L., Staples, C.R. and Blankenship, R.E. 2004. The natural history of nitrogen fixation. Molecular Biology and Evolution 21:541-554.

Richardson, E.L., King, C.K. and Powell, S.M. 2015. The use of microbial gene abundance in the development of fuel remediation guidelines in polar soils. Integrated Environmental Assessment and Management 11:235-241.

Rinke, C., Schwientek, P., Sczyrba, A., Ivanova, N.N., Anderson, I.J., Cheng, J.-F., Darling, A., Malfatti, S., Swan, B.K. and Gies, E.A. 2013. Insights into the phylogeny and coding potential of microbial dark matter. Nature 499:431-437.

Roesch, L.F., Fulthorpe, R.R., Pereira, A.B., Pereira, C.K., Lemos, L.N., Barbosa, A.D., Suleiman, A.K., Gerber, A.L., Pereira, M.G., Loss, A. and da Costa, E.M. 2012. Soil bacterial community abundance and diversity in ice-free areas of Keller Peninsula, Antarctica. Applied Soil Ecology 61:7-15.

Roser, D., Seppelt, R. and Ashbolt, N. 1993. Microbiology of ornithogenic soils from the Windmill Islands, Budd Coast, continental Antarctica: microbial biomass distribution. Soil Biology and Biochemistry 25:165-175.

Rousk, J., Bååth, E., Brookes, P.C., Lauber, C.L., Lozupone, C., Caporaso, J.G., Knight, R. and Fierer, N. 2010. Soil bacterial and fungal communities across a pH gradient in an arable soil. The ISME Journal 4:1340-1351.

Royles, J., Amesbury, M.J., Convey, P., Griffiths, H., Hodgson, D.A., Leng, M.J. and Charman, D.J. 2013. Plants and soil microbes respond to recent warming on the Antarctic Peninsula. Current Biology 23:1702-1706.

Ruberto, L., Vazquez, S.C. and Mac Cormack, W.P. 2003. Effectiveness of the natural bacterial flora, biostimulation and bioaugmentation on the bioremediation of a hydrocarbon contaminated Antarctic soil. International Biodeterioration & Biodegradation 52:115-125.

Sambrook, J. and Sambrook, J., Eds. 2001. Molecular cloning: a laboratory manual. Cold Spring Harbor, New York, Cold Spring Harbor Laboratory Press.

Sangwan, N., Xia, F. and Gilbert, J.A. 2016. Recovering complete and draft population genomes from metagenome datasets. Microbiome 4:1.

Sattley, W.M. and Madigan, M.T. 2006. Isolation, characterization, and ecology of cold-active, chemolithotrophic, sulfur-oxidizing bacteria from perennially ice-covered Lake Fryxell, Antarctica. Applied and Environmental Microbiology 72:5562-5568.

Sasaki, E., Ogasawara, Y. and Liu, H.-w. 2010. A biosynthetic pathway for BE-7585A, a 2-thiosugar-containing angucycline-type natural product. Journal of the American Chemical Society 132:7405-7417.

Schloss, P.D., Westcott, S.L., Ryabin, T., Hall, J.R., Hartmann, M., Hollister, E.B., Lesniewski, R.A., Oakley, B.B., Parks, D.H., Robinson, C.J. and Sahl, J.W. 2009. Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Applied and Environmental Microbiology 75:7537-7541.

Schmidt, H. and Eickhorst, T. 2014. Detection and quantification of native microbial populations on soil-grown rice roots by catalyzed reporter deposition-fluorescence in situ hybridization. FEMS Microbiology Ecology 87:390-402.

Schramm, A., Fuchs, B.M., Nielsen, J.L., Tonolla, M. and Stahl, D.A. 2002. Fluorescence in situ hybridization of 16S rRNA gene clones (Clone-FISH) for probe validation and screening of clone libraries. Environmental Microbiology 4:713-720.

Schwartz, E., Voigt, B., Zühlke, D., Pohlmann, A., Lenz, O., Albrecht, D., Schwarze, A., Kohlmann, Y., Krause, C., Hecker, M. and Friedrich, B. 2009. A proteomic view of the facultatively chemolithoautotrophic lifestyle of Ralstonia eutropha H16. Proteomics 9:5132-5142.

Schwartz, E., Van Horn, D.J., Buelow, H.N., Okie, J.G., Gooseff, M.N., Barrett, J.E. and Takacs-Vesbach, C.D. 2014. Characterization of growing bacterial populations in McMurdo Dry Valley soils through stable isotope probing with 18O-water. FEMS Microbiology Ecology 89:415-425.

Seemann, T. 2014. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30:2068-2069.

Segawa, T., Takeuchi, N., Rivera, A., Yamada, A., Yoshimura, Y., Barcaza, G., Shinbori, K., Motoyama, H., Kohshima, S. and Ushida, K. 2013. Distribution of antibiotic resistance genes in glacier environments. Environmental Microbiology Reports 5:127-134.

Serkebaeva, Y.M., Kim, Y., Liesack, W. and Dedysh, S.N. 2013. Pyrosequencing-based assessment of the Bacteria diversity in surface and subsurface peat layers of a northern wetland, with focus on poorly studied phyla and candidate divisions. PLoS One 8:e63994. doi:10.1371/journal.pone.0063994.

Shafaat, H.S., Rüdiger, O., Ogata, H. and Lubitz, W. 2013. [NiFe] hydrogenases: a common active site for hydrogen metabolism under diverse conditions. Biochimica et Biophysica Acta (BBA)-Bioenergetics 1827:986-1002.

Shannon, P., Markiel, A., Ozier, O., Baliga, N.S., Wang, J.T., Ramage, D., Amin, N., Schwikowski, B. and Ideker, T. 2003. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Research 13:2498-2504.

Sharon, I., Morowitz, M.J., Thomas, B.C., Costello, E.K., Relman, D.A. and Banfield, J.F. 2013. Time series community genomics analysis reveals rapid shifts in bacterial species, strains, and phage during infant gut colonization. Genome Research 23:111-120.

Shravage, B.V., Dayananda, K.M., Patole, M.S. and Shouche, Y.S. 2007. Molecular microbial diversity of a soil sample and detection of ammonia oxidizers from Cape Evans, McMurdo Dry Valley, Antarctica. Microbiological Research 162:15-25.

Siciliano, S.D., Palmer, A.S., Winsley, T., Lamb, E., Bissett, A., Brown, M.V., van Dorst, J., Ji, M., Ferrari, B.C. and Grogan, P. 2014. Soil fertility is associated with fungal and bacterial richness, whereas pH is associated with community composition in polar soil microbial communities. Soil Biology and Biochemistry 78:10-20.

Siciliano, S.D., Palmer, A.S., Winsley, T., Lamb, E., Bissett, A., Brown, M.V., van Dorst, J., Ji, M., Ferrari, B.C., Grogan, P., Chu, H. and Snape, I. (2013, updated 2014). Polar soil bacterial and fungal biodiversity survey. A. A. D. Centre.

Simpson, J.T., Wong, K., Jackman, S.D., Schein, J.E., Jones, S.J.M. and Birol, I. 2009. ABySS: a parallel assembler for short read sequence data. Genome Research 19:1117-1123.

Singh, D., Takahashi, K., Kim, M., Chun, J. and Adams, J.M. 2012. A hump-backed trend in bacterial diversity with elevation on Mount Fuji, Japan. Microbial Ecology 63:429-437.

Sizova, M.V., Hohmann, T., Hazen, A., Paster, B.J., Halem, S.R., Murphy, C.M., Panikov, N.S. and Epstein, S.S. 2012. New approaches for isolation of previously uncultivated oral bacteria. Applied and Environmental Microbiology 78:194-203.

Sjöling, S. and Cowan, D.A. 2000. Detecting human bacterial contamination in Antarctic soils. Polar Biology 23:644-650.

Smith, J.J., Tow, L.A., Stafford, W., Cary, C. and Cowan, D.A. 2006. Bacterial diversity in three different Antarctic cold desert mineral soils. Microbial Ecology 51:413-421.

Smith, J.J., Howington, J.P. and McFeters, G.A. 1994. Survival, physiological response and recovery of enteric bacteria exposed to a polar marine environment. Applied and Environmental Microbiology 60:2977-2984.

Solden, L., Lloyd, K. and Wrighton, K. 2016. The bright side of microbial dark matter: lessons learned from the uncultivated majority. Current Opinion in Microbiology 31:217-226.

Soo, R.M., Skennerton, C.T., Sekiguchi, Y., Imelfort, M., Paech, S.J., Dennis, P.D., Steen, J.A., Parks, D.H., Tyson, G.W. and Hugenholtz, P. 2014. An expanded genomic representation of the phylum Cyanobacteria. Genome Biology and Evolution 6:1031-1045.

Speir, T. and Cowling, J. 1984. Ornithogenic soils of the Cape Bird adelie penguin rookeries, Antarctica. Polar Biology 2:199-205.

Stahl, D.A. and de la Torre, J.R. 2012. Physiology and diversity of ammonia-oxidizing archaea. Annual Review of Microbiology 66:83-101.

Stewart, K.J., Snape, I. and Siciliano, S.D. 2012. Physical, chemical and microbial soil properties of frost boils at Browning Peninsula, Antarctica. Polar Biology 35:463-468.

Stott, M.B., Crowe, M.A., Mountain, B.W., Smirnova, A.V., Hou, S., Alam, M. and Dunfield, P.F. 2008. Isolation of novel bacteria, including a candidate division, from geothermal soils in New Zealand. Environmental Microbiology 54:413-421.

Su, X., Chen, X., Hu, J., Shen, C. and Ding, L. 2013. Exploring the potential environmental functions of viable but non-culturable bacteria. World Journal of Microbiology and Biotechnology 29:2213-2218.

Sutherland, D.L. 2009. Microbial mat communities in response to recent changes in the physiochemical environment of the meltwater ponds on the McMurdo Ice Shelf, Antarctica. Polar Biology 32:1023-1032.


Tabita, F.R., Satagopan, S., Hanson, T.E., Kreel, N.E. and Scott, S.S. 2008. Distinct form I, II, III, and IV Rubisco proteins from the three kingdoms of life provide clues about Rubisco evolution and structure/function relationships. Journal of Experimental Botany 59:1515-1524.

Tahon, G., Tytgat, B., Stragier, P. and Willems, A. 2016. Analysis of cbbL, nifH, and pufLM in soils from the Sør Rondane Mountains, Antarctica, reveals a large diversity of autotrophic and phototrophic bacteria. Microbial Ecology 71:131-149.

Taton, A., Grubisic, S., Brambilla, E., De Wit, R. and Wilmotte, A. 2003. Cyanobacterial diversity in natural and artificial microbial mats of Lake Fryxell (McMurdo Dry Valleys, Antarctica): a morphological and molecular approach. Applied and Environmental Microbiology 69:5157-5169.

Tebo, B.M., Davis, R.E., Anitori, R.P., Connell, L.B., Schiffman, P. and Staudigel, H. 2015. Microbial communities in dark oligotrophic volcanic ice cave ecosystems of Mt. Erebus, Antarctica. Frontiers in Microbiology 6:179.

Teixeira, L.C., Peixoto, R.S., Cury, J.C., Sul, W.J., Pellizari, V.H., Tiedje, J. and Rosado, A.S. 2010. Bacterial diversity in rhizosphere soil from Antarctic vascular plants of Admiralty Bay, maritime Antarctica. The ISME Journal 4:989-1001.

Terauds, A., Chown, S.L., Morgan, F., Peat, H.J., Watts, D.J., Keys, H., Convey, P. and Bergstrom, D.M. 2012. Conservation biogeography of the Antarctic. Diversity and Distributions 18:726-741.

Teo, J.K.C. and Wong, C.M.V.L. 2014. Analyses of soil bacterial diversity of the Schirmacher Oasis, Antarctica. Polar Biology 37:631-640.

Tiao, G., Lee, C.K., McDonald, I.R., Cowan, D.A. and Cary, S.C. 2012. Rapid microbial response to the presence of an ancient relic in the Antarctic Dry Valleys. Nature Communications 3:660.

Tindall, B.J. 2004. Prokaryotic diversity in the Antarctic: the tip of the iceberg. Microbial Ecology 47:271-283.


Tujula, N.A., Holmström, C., Mußmann, M., Amann, R., Kjelleberg, S. and Crocetti, G.R. 2006. A CARD-FISH protocol for the identification and enumeration of epiphytic bacteria on marine algae. Journal of Microbiological Methods 65:604-607.

Tyson, G.W., Chapman, J., Hugenholtz, P., Allen, E.E., Ram, R.J., Richardson, P.M., Solovyev, V.V., Rubin, E.M., Rokhsar, D.S. and Banfield, J.F. 2004. Community structure and metabolism through reconstruction of microbial genomes from the environment. Nature 428:37-43.

van Dorst, J., Bissett, A., Palmer, A.S., Brown, M., Snape, I., Stark, J.S., Raymond, B., McKinlay, J., Ji, M., Winsley, T. and Ferrari, B.C. 2014. Community fingerprinting in a sequencing world. FEMS Microbiology Ecology 89:316-330.

van Dorst, J., Siciliano, S.D., Winsley, T., Snape, I. and Ferrari, B.C. 2014. Bacterial targets as potential indicators of diesel fuel toxicity in subantarctic soils. Applied and Environmental Microbiology 80:4021-4033.

Van Goethem, M.W., Makhalanyane, T.P., Valverde, A., Cary, S.C. and Cowan, D.A. 2016. Characterization of bacterial communities in lithobionts and soil niches from Victoria Valley, Antarctica. FEMS Microbiology Ecology 92:fiw051.

Van Horn, D.J., Van Horn, M.L., Barrett, J.E., Gooseff, M.N., Altrichter, A.E., Geyer, K.M., Zeglin, L.H. and Takacs-Vesbach, C.D. 2013. Factors controlling soil microbial biomass and bacterial diversity and community composition in a cold desert ecosystem: role of geographic scale. PLoS One 8:e66103.

Vandamme, P., Pot, B., Gillis, M., De Vos, P., Kersters, K. and Swings, J. 1996. Polyphasic taxonomy, a consensus approach to bacterial systematics. Microbiological Reviews 60:407-438.

Varin, T., Lovejoy, C., Jungblut, A.D., Vincent, W.F. and Corbeil, J. 2012. Metagenomic analysis of stress genes in microbial mat communities from Antarctica and the High Arctic. Applied and Environmental Microbiology 78:549-559.

Vartoukian, S.R., Palmer, R.M. and Wade, W.G. 2010. Strategies for culture of ‘unculturable’ bacteria. FEMS Microbiology Letters 309:1-7.

Verhamme, D.T., Prosser, J.I. and Nicol, G.W. 2011. Ammonia concentration determines differential growth of ammonia-oxidising archaea and bacteria in soil microcosms. The ISME Journal 5:1067-1071.

Vignais, P.M. and Billoud, B. 2007. Occurrence, classification, and biological function of hydrogenases: an overview. Chemical Reviews 107:4206-4272.

Vincent, W., Downes, M., Castenholz, R. and Howard-Williams, C. 1993. Community structure and pigment organisation of cyanobacteria-dominated microbial mats in Antarctica. European Journal of Phycology 28:213-221.

Vincent, W.F. and Quesada, A. 2012. Cyanobacteria in high latitude lakes, rivers and seas. Ecology of Cyanobacteria II, Springer: 371-385.

Volkman, J., Burton, H., Everitt, D. and Allen, D. 1988. Pigment and lipid compositions of algal and bacterial communities in Ace Lake, Vestfold Hills, Antarctica. Biology of the Vestfold Hills, Antarctica, Springer: 41-57.

Wall, D.H. and Virginia, R.A. 1999. Controls on soil biodiversity: insights from extreme environments. Applied Soil Ecology 13:137-150.

Wang, N.F., Zhang, T., Zhang, F., Wang, E.T., He, J.F., Ding, H., Zhang, B.T., Liu, J., Ran, X.B. and Zang, J.Y. 2015. Diversity and structure of soil bacterial communities in the Fildes Region (maritime Antarctica) as revealed by 454 pyrosequencing. Frontiers in Microbiology 6:1188. doi:10.3389/fmicb.2015.01188.

Ward, B., Devol, A., Rich, J., Chang, B., Bulow, S., Naik, H., Pratihary, A. and Jayakumar, A. 2009. Denitrification as the dominant nitrogen loss process in the Arabian Sea. Nature 461:78-81.

Wei, S.T., Fernandez-Martinez, M.-A., Chan, Y., Van Nostrand, J.D., de los Rios-Murillo, A., Chiu, J.M., Ganeshram, A.M., Cary, S.C., Zhou, J. and Pointing, S.B. 2015. Diverse metabolic and stress-tolerance pathways in chasmoendolithic and soil communities of Miers Valley, McMurdo Dry Valleys, Antarctica. Polar Biology 38:433-443.

Welsh, D.T. and Herbert, R.A. 1999. Osmotically induced intracellular trehalose, but not glycine betaine accumulation promotes desiccation tolerance in Escherichia coli. FEMS Microbiology Letters 174:57-63.

Werner, J.J., Koren, O., Hugenholtz, P., DeSantis, T.Z., Walters, W.A., Caporaso, J.G., Angenent, L.T., Knight, R. and Ley, R.E. 2012. Impact of training sets on classification of high-throughput bacterial 16S rRNA gene surveys. The ISME Journal 6:94-103.

White, T.J., Bruns, T., Lee, S.J.W.T. and Taylor, J.W. 1990. Amplification and direct sequencing of fungal ribosomal RNA genes for phylogenetics. PCR protocols: a guide to methods and applications 18:315-322.

Whitton, B.A. 2012. Ecology of cyanobacteria II: their diversity in space and time, Springer Science & Business Media.

Wierzchos, J., de los Ríos, A. and Ascaso, C. 2013. Microorganisms in desert rocks: the edge of life on Earth. International Microbiology 15:172-182.

Wood, S.A., Rueckert, A., Cowan, D.A. and Cary, S.C. 2008. Sources of edaphic cyanobacterial diversity in the Dry Valleys of Eastern Antarctica. The ISME Journal 2:308-320.

Wynn-Williams, D. and Edwards, H. 2000. Antarctic ecosystems as models for extraterrestrial surface habitats. Planetary and Space Science 48:1065-1075.

Yakimov, M.M., Gentile, G., Bruni, V., Cappello, S., D'Auria, G., Golyshin, P.N. and Giuliano, L. 2004. Crude oil-induced structural shift of coastal bacterial communities of Rod Bay (Terra Nova Bay, Ross Sea, Antarctica) and characterization of cultured cold-adapted hydrocarbonoclastic bacteria. FEMS Microbiology Ecology 49:419-432.

Yamada, Y., Kuzuyama, T., Komatsu, M., Shin-ya, K., Omura, S., Cane, D.E. and Ikeda, H. 2015. Terpene synthases are widely distributed in bacteria. Proceedings of the National Academy of Sciences 112:857-862.

Yeoh, Y.K., Sekiguchi, Y., Parks, D.H. and Hugenholtz, P. 2016. Comparative genomics of candidate phylum TM6 suggests that parasitism is widespread and ancestral in this lineage. Molecular Biology and Evolution 33:915-927.

Yergeau, E., Bokhorst, S., Huiskes, A.H., Boschker, H.T., Aerts, R. and Kowalchuk, G.A. 2007. Size and structure of bacterial, fungal and nematode communities along an Antarctic environmental gradient. FEMS Microbiology Ecology 59:436-451.

Yergeau, E., Kang, S., He, Z., Zhou, J. and Kowalchuk, G.A. 2007. Functional microarray analysis of nitrogen and carbon cycling genes across an Antarctic latitudinal transect. The ISME Journal 1:163-179.

Yergeau, E., Newsham, K.K., Pearce, D.A. and Kowalchuk, G.A. 2007. Patterns of bacterial diversity across a range of Antarctic terrestrial habitats. Environmental Microbiology 9:2670-2682.

Yergeau, E., Schoondermark-Stolk, S.A., Brodie, E.L., Déjean, S., DeSantis, T.Z., Gonçalves, O., Piceno, Y.M., Andersen, G.L. and Kowalchuk, G.A. 2009. Environmental microarray analyses of Antarctic soil microbial communities. The ISME Journal 3:340-351.

Yin, Y., Mao, X., Yang, J.C., Chen, X., Mao, F. and Xu, Y. 2012. dbCAN: a web resource for automated carbohydrate-active enzyme annotation. Nucleic Acids Research 40:445-451.

Yokota, A. 2012. Cultivation of uncultured bacteria of the class Ktedonobacter in the phylum Chloroflexi. MAKARA of Science Series 16. http://journal.ui.ac.id/index.php/science/article/view/1273. Accessed on 9 June 2014.

Youssef, N.H., Couger, M., McCully, A.L., Criado, A.E.G. and Elshahed, M.S. 2015. Assessing the global phylum level diversity within the bacterial domain: A review. Journal of Advanced Research 6:269-282.

Zarda, B., Hahn, D., Chatzinotas, A., Schönhuber, W., Neef, A., Amann, R.I. and Zeyer, J. 1997. Analysis of bacterial community structure in bulk soil by in situ hybridization. Archives of Microbiology 168:185-192.

Zeglin, L.H., Dahm, C.N., Barrett, J.E., Gooseff, M.N., Fitpatrick, S.K. and Takacs-Vesbach, C.D. 2011. Bacterial community structure along moisture gradients in the parafluvial sediments of two ephemeral desert streams. Microbial Ecology 61:543-556.

Zhalnina, K.V., Dias, R., Leonard, M.T., Dorr de Quadros, P., Camargo, F.A., Drew, J.C., Farmerie, W.G., Daroub, S.H. and Triplett, E.W. 2014. Genome sequence of Candidatus Nitrososphaera evergladensis from group I.1b enriched from Everglades soil reveals novel genomic features of the ammonia-oxidizing archaea. PLoS One 9:e101648.

Zhang, L.-M., Hu, H.-W., Shen, J.-P. and He, J.-Z. 2012. Ammonia-oxidizing archaea have more important role than ammonia-oxidizing bacteria in ammonia oxidation of strongly acidic soils. The ISME Journal 6:1032-1045.

Zhang, E. 2016. Dynamic roles of a secondary metabolite gene in the ecology of polar soils (Honours), University of New South Wales.

Zhao, J., Yang, N. and Zeng, R. 2008. Phylogenetic analysis of type I polyketide synthase and nonribosomal peptide synthetase genes in Antarctic sediment. Extremophiles 12:97-105.

Zhou, J., Xia, B., Huang, H., Treves, D.S., Hauser, L.J., Mural, R.J., Palumbo, A.V. and Tiedje, J.M. 2003. Bacterial phylogenetic diversity and a novel candidate division of two humid region, sandy surface soils. Soil Biology and Biochemistry 35:915-924.

Zucconi, L., Onofri, S., Cecchini, C., Isola, D., Ripa, C., Fenice, M., Madonna, S., Reboleiro-Rivas, P. and Selbmann, L. 2016. Mapping the lithic colonization at the boundaries of life in Northern Victoria Land, Antarctica. Polar Biology 39:91-102.


Appendices

Appendix 2.1 DistLM results showing environmental parameters that significantly (p<0.05) explained the bacterial and fungal distributions

| Bacteria: variable | Pseudo-F | Proportion explained | Fungi: variable | Pseudo-F | Proportion explained |
|---|---|---|---|---|---|
| Longitude | 22.9 | 21% | Longitude | 21.865 | 20% |
| Total carbon (% w/w) | 7.2579 | 6% | Slope | 9.4887 | 8% |
| Latitude | 5.205 | 4% | Total carbon (% w/w) | 4.6577 | 4% |
| TiO2 (%) | 2.9914 | 2% | %Mud | 2.8211 | 2% |
| Water content (%) | 2.9657 | 2% | Ca % of eCEC | 2.4972 | 2% |
| eCEC (meq/100g DMB) | 3.0627 | 2% | Distance on transect (m) | 2.1698 | 2% |
| Distance on transect (m) | 1.9183 | 1% | P2O5 (%) | 1.9468 | 1% |
| Fe (mg/kg DMB) | 1.8136 | 1% | Mg % of eCEC | 1.8019 | 1% |
| Al (mg/kg DMB) | 1.787 | 1% | SO4 (mg/kg DMB) | 1.7779 | 1% |
| SO3 (%) | 1.6008 | 1% | PO4 (mg/kg DMB) | 1.6381 | 1% |
| %Mud | 1.5054 | 1% | Conductivity (uS/cm) | 1.485 | 1% |
| Phosphorus Saturation Ratio | 1.4817 | 1% | Ca (mg/kg DMB) | 1.4008 | 1% |
| S (mg/kg DMB) | 1.4791 | 1% | pH | 1.4934 | 1% |
| P (mg/kg DMB) | 1.4636 | 1% | Minimum particle size (m) | 1.3486 | 1% |
| Ca (mg/kg DMB) | 1.3974 | 1% | | | |
| Mn (mg/kg DMB) | 1.3299 | 1% | | | |
| SO4 (mg/kg DMB) | 1.3289 | 1% | | | |
| Fe2O3 (%) | 1.3168 | 1% | | | |
| CaO (%) | 1.2591 | 1% | | | |
| Cl (mg/kg DMB) | 1.2793 | 1% | | | |
| Zn (mg/kg DMB) | 1.2599 | 1% | | | |
| Conductivity (uS/cm) | 1.2521 | 1% | | | |


Appendix 3.1 Distribution of Robinson Ridge genes by KEGG categories.

[Figure: bar chart with a vertical axis running from 0 to 0.2. KEGG categories shown: Carbohydrate metabolism; Energy metabolism; Lipid metabolism; Nucleotide metabolism; Amino acid metabolism; Metabolism of other amino acids; Glycan biosynthesis and metabolism; Metabolism of cofactors and vitamins; Metabolism of terpenoids and polyketides; Biosynthesis of other secondary metabolites; Xenobiotics biodegradation and metabolism; Transcription; Translation; Folding, sorting and degradation; Replication and repair; Membrane transport; Signal transduction; Signaling molecules and interaction; Transport and catabolism; Cell motility; Cell growth and death; Cellular community.]


Appendix 3.2 Hierarchical cluster analysis of the Robinson Ridge metagenome against 17 publicly available soil metagenomes, including those from grassland, desert, Arctic tundra and volcanic soils.

A comparison of the relative abundance of KEGG categories showed that genes associated with carbohydrate, lipid, polyketide and other amino acid metabolism, and with the metabolism of xenobiotics, were significantly over-represented (t-test, P < 0.05) in Robinson Ridge. The black line indicates statistically significant clustering, with Robinson Ridge forming a separate cluster (P < 0.05) that was most similar to volcanic soils.

[Figure: cluster dendrogram; legend: Soil group.]


Appendix 3.3 Key genes identified in the draft genomes involved in carbon fixation, and carbon, nitrogen, sulphur cycling.

AD3: candidate phylum AD3, Thauma: Thaumarchaeota, Actino: Actinobacteria, WPS: candidate phylum WPS-2, Alphapro: Alpha-proteobacteria, Verruco: Verrucomicrobia, Meta: metagenome.

| Category | KEGG ID | Function | AD3 | Thauma | Actino | WPS | Chloro | Alphapro | Verruco | Meta |
|---|---|---|---|---|---|---|---|---|---|---|
| Carbon fixation | K06281 | hydrogenase large subunit | 3 | 0 | 9 | 1 | 2 | 0 | 1 | 21 |
| Carbon fixation | K06282 | hydrogenase small subunit | 2 | 0 | 7 | 1 | 1 | 0 | 1 | 21 |
| Carbon fixation | K03518 | carbon-monoxide dehydrogenase small subunit | 2 | 0 | 11 | 2 | 1 | 0 | 0 | 53 |
| Carbon fixation | K03519 | carbon-monoxide dehydrogenase medium subunit | 5 | 0 | 7 | 1 | 1 | 0 | 0 | 44 |
| Carbon fixation | K03520 | carbon-monoxide dehydrogenase large subunit | 7 | 0 | 8 | 0 | 0 | 0 | 0 | 83 |
| Carbon fixation | K02018 | molybdate/tungstate transport system permease protein | 2 | 0 | 11 | 1 | 2 | 1 | 2 | 30 |
| Carbon fixation | K01601 | RubisCO large subunit | 2 | 0 | 6 | 1 | 1 | 0 | 0 | 22 |
| Carbon fixation | K01602 | RubisCO small subunit | 2 | 0 | 7 | 1 | 0 | 0 | 0 | 14 |
| Carbon fixation | K01673 | molybdate/tungstate transport system permease protein | 3 | 0 | 15 | 3 | 0 | 2 | 2 | 51 |
| Nitrogen cycling | K02588 | nifH, nitrogen fixation | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Nitrogen cycling | K00367 | narB, assimilatory nitrate reduction | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 2 |
| Nitrogen cycling | K00372 | nasA, assimilatory nitrate reduction | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 14 |
| Nitrogen cycling | K00366 | nirA, assimilatory nitrate reduction | 1 | 0 | 2 | 0 | 0 | 0 | 0 | 3 |
| Nitrogen cycling | K00362 | nirB, dissimilatory nitrate reduction | 0 | 0 | 7 | 0 | 0 | 1 | 0 | 19 |
| Nitrogen cycling | K00370 | narG, dissimilatory nitrate reduction/denitrification | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 7 |
| Nitrogen cycling | K00368 | nirK, denitrification | 0 | 3 | 0 | 1 | 0 | 1 | 2 | 14 |
| Nitrogen cycling | K00376 | nosZ, denitrification | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
| Nitrogen cycling | K10945 | ammonia monooxygenase subunit C, ammonia oxidation | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
| Sugar utilisation | K01179 | Endoglucanase, cellulose->cellobiose | 0 | 0 | 5 | 0 | 4 | 0 | 0 | 19 |
| Sugar utilisation | K05349 | beta-glucosidase, cellobiose->beta-D-glucose | 6 | 0 | 1 | 0 | 0 | 1 | 0 | 29 |
| Sugar utilisation | K05350 | beta-glucosidase, cellobiose->beta-D-glucose | 4 | 1 | 2 | 0 | 2 | 0 | 0 | 9 |
| Sugar utilisation | K00700 | 1,4-alpha-glucan branching enzyme, Amylose <=> Starch | 5 | 0 | 11 | 1 | 2 | 0 | 0 | 33 |
| Sugar utilisation | K16149 | 1,4-alpha-glucan branching enzyme, Amylose <=> Starch | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 12 |
| Sugar utilisation | K00688 | starch phosphorylase, Starch <=> D-Glucose 1-phosphate | 3 | 0 | 7 | 0 | 1 | 0 | 2 | 43 |
| Sugar utilisation | K01176 | alpha-amylase, Starch <=> Dextrin | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 5 |
| Sugar utilisation | K01178 | Glucoamylase, Starch <=> alpha-D-Glucose | 1 | 0 | 2 | 0 | 0 | 1 | 3 | 17 |
| Sugar utilisation | K01196 | glycogen debranching enzyme, Starch <=> alpha-D-Glucose | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 12 |
| Sugar utilisation | K00705 | 4-alpha-glucanotransferase, Maltose <=> alpha-D-glucose | 3 | 0 | 6 | 0 | 0 | 0 | 1 | 21 |
| Sugar utilisation | K12047 | maltase-glucoamylase, Maltose <=> alpha-D-glucose | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Sugar utilisation | K01187 | alpha-glucosidase, Maltose <=> alpha-D-glucose | 3 | 0 | 11 | 0 | 2 | 0 | 0 | 36 |
| Sugar utilisation | K15922 | alpha-glucosidase, Maltose <=> alpha-D-glucose | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 |
| Sugar utilisation | K01051 | Pectinesterase, Pectin <=> Methanol+Pectate | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| Sugar utilisation | K01198 | xylan 1,4-beta-xylosidase, xylan <=> Xylose | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 3 |
| Sugar utilisation | K01183 | Chitinase, Chitin <=> glucosamine/Chitobiose | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 4 |
| Trehalose biosynthesis | K00697 | trehalose 6-phosphate synthase, TPS*TPP pathway | 3 | 0 | 10 | 0 | 1 | 1 | 1 | 31 |
| Trehalose biosynthesis | K16055 | trehalose 6-phosphate synthase/phosphatase, TPS*TPP pathway | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 4 |
| Trehalose biosynthesis | K01087 | trehalose 6-phosphate phosphatase, TPS*TPP pathway | 3 | 0 | 6 | 0 | 1 | 1 | 0 | 18 |
| Trehalose biosynthesis | K05343 | maltose alpha-D-glucosyltransferase/alpha-amylase, TS pathway | 3 | 0 | 11 | 0 | 3 | 2 | 1 | 43 |
| Trehalose biosynthesis | K02438 | glycogen operon protein, treX | 3 | 0 | 16 | 0 | 1 | 0 | 2 | 40 |
| Trehalose biosynthesis | K06044 | (1->4)-alpha-D-glucan 1-alpha-D-glucosylmutase, treY | 4 | 0 | 9 | 0 | 0 | 0 | 2 | 27 |
| Trehalose biosynthesis | K01236 | maltooligosyltrehalose trehalohydrolase, treZ | 3 | 0 | 10 | 0 | 0 | 0 | 2 | 28 |
| Trehalose biosynthesis | K05342 | alpha,alpha-trehalose phosphorylase, TreP pathway | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Appendix 3.4 Protein coding sequences of 3-hydroxypropionate carbon fixation pathway within the recovered thaumarchaeota draft genome (bin 19).

The sequences identified by Zhalnina et al. (2014) were used as reference sequences, and putative genes from bin 19 were compared with the genes identified from three ammonia-oxidising Nitrososphaera and two ammonia-oxidising Nitrosopumilus genomes.

Putative genes involved in carbon fixation | Reference NCBI accession number | Nitrososphaera evergladensis SR1 | Nitrososphaera viennensis EN7 | Nitrososphaera gargensis Ga9.2 | Nitrosoarchaeum limnia SFB1 | Nitrosopumilus koreensis MY1
birA, biotin-(acetyl-CoA-carboxylase) | AIF84040 | 43% | 45% | 45% | 35% | 38%
acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) | AIF84813 | 80% | 80% | 80% | 66% | 65%
Acetyl/propionyl-CoA carboxylase, alpha subunit (Biotin carboxylase) | AIF84814 | 72% | 71% | 71% | 57% | 57%
Acetyl/propionyl-CoA carboxylase, alpha subunit (Pyruvate carboxyl subunit B) | AIF84815 | 53% | 54% | 53% | 39% | 39%
methylmalonyl-CoA epimerase | AIF83180 | 75% | 75% | 79% | 49% | 52%
Methylmalonyl-CoA N-terminal domain | AIF83181 | 74% | 75% | 77% | 66% | 65%
Methylmalonyl-CoA mutase C-terminal domain | AIF83184 | 78% | 77% | 78% | 64% | 64%
Fumarate hydratase (fumarase) | AIF82837 | 69% | 70% | 71% | 61% | 62%
succinate dehydrogenase/fumarate reductase flavoprotein subunit | AIF82619 | 74% | n.a. | 34% | 35% | 35%
succinate dehydrogenase/fumarate reductase flavoprotein subunit | AIF82619 | 74% | n.a. | 36% | 37% | 37%
succinate dehydrogenase and fumarate reductase iron-sulfur protein | AIF83654 | 73% | 69% | 75% | 54% | 58%
succinate dehydrogenase and fumarate reductase iron-sulfur protein | AIF83654 | 69% | 67% | 71% | 54% | 55%
succinate dehydrogenase, hydrophobic anchor subunit | AIF83655 | 68% | 70% | 69% | 52% | 53%
succinate dehydrogenase, hydrophobic anchor subunit | AIF83655 | 71% | 73% | 80% | 61% | 60%
succinate dehydrogenase subunit C | AIF83656 | 75% | 75% | 78% | 51% | 51%
ubiquinone-dependent succinate dehydrogenase or fumarate reductase, flavoprotein subunit | AIF83657 | 78% | 78% | 82% | 70% | 70%
Malonyl-CoA reductase | AIF85147 | 77% | 76% | 78% | 58% | 57%
3-hydroxyacyl-CoA dehydrogenase | AIF84511 | 71% | 71% | 74% | 60% | 59%
3-hydroxypropionyl-CoA synthetase | AIF84423 | 71% | 71% | 72% | 61% | 60%

Appendix 3.5 KEGG map of CBB carbon fixation pathway.

Red blocks indicate genes identified in the Robinson Ridge metagenome.

Appendix 3.6 KEGG map of nitrogen metabolism.

Red blocks indicate genes identified in the Robinson Ridge metagenome.

Appendix 3.7 Taxonomy of methane monooxygenase identified from Robinson Ridge metagenome based on BLAST.

mmo: soluble methane monooxygenase; pmo: particulate methane monooxygenase

Type | Gene | Protein sequence | Taxonomy inferred by similarity | Closest sequence | Similarity
soluble methane monooxygenase | mmoX | VSRASITKAHDKIQELSWDPTYVTPVEKYPTDYTFEKAPKKDPLKQVLRSYFPMEEEKDNRVFGAMDGAIRGNMFRQVQERWMEWQKLFLSIIPFPEISAARAMPLTIGVVPNPEVHNGLAIQMIDEVRHSTIQMNLKRLYMNHYIDPAGFDITTKGFQNCYAGTIGRQFGEGFITGDAITAANIYLTIVAETAFTNVLFVAMPGEAAANGDYLLPTVFHSVXPGAARA | Actinobacteria | WP_028063546 | 92%
soluble methane monooxygenase | mmoX | VDAAIGTFIEYGTKDRRKDRESYAEAWRRWIYDDYYRSYLVPLEKYGLTIPHDLVEESWNRIWNKGYIHETAQFFATGWWANYWRIDGMDDTDFEWFEHKYPGWYDKYGKWWERYTDLSKKNGHAPICFDADTDYVYPHRCWTCMVPCMIREDXGGRRSRRPGPHVLLGDLPLDRRGGIPWRVQGPSDARHGQADRHARVGDCLPRGRPR | Actinobacteria | WP_028063546 | 90%
soluble methane monooxygenase | mmoY | MAVETDKKKERSVPKPVFTDAEAGALTFPSSKSRSFNYFKAAKLHASLYEDVTVDVQPDPARHLTQGWVYGFAKGPGGFPEEWTKIKSSNWHAFLDPNEEWEQTIYRNNANVVRQITQNLANAKARQAYASWSTGWTRVVERHVGAWMHAEHGLGMHVFLPAQRDAPTNMINNAISVNSMHKLRFAQDLVLYNLEISGEIAGFDGSAHKDVWMNDPSWQGVRENVERLTAVRDWAEAVFAANYVFEGLVGELFRSQFVMQVAAPNGDYVTPTLMGAGESDYERDLRYSRVLFKLLADDPKHGDANRTLMEKWLGQWVPMSLAAARKLQPIWSQPTEKAVRFEDSLAHTTERMRGHLDEVEVRAPKELG | Actinobacteria | WP_055494570 | 68%
soluble methane monooxygenase | mmoY | VVFEPLVGELFRSQLVQHAAPRNGDFVTPTVVGAEEYDYAERDLRYTRPMFELLTSDREFGDQNKAKLQEWLSVWTPRAIAAARTLQPLWSQPESKPPRFEDGLDAQKRRFSGILSDLHLEDPKELAQ | Actinobacteria | WP_027935444 | 77%
soluble methane monooxygenase | mmoY | MTTAPERSVPKPVFTDAEAGAKEFPDSTARRFNYYTPAKRKQTHYEDVTVEVQPDPRHYLSQGWLYGFSDGRGGYPLDWTALKAWGSDRPEPERYPGSGGKGYDWPAHGWHEFRDPNEEWELTLYRYNANVVRQLNQNIDAARQAKAFEQWNQNWVRFVERNVGAWMHVDHGLGLYLFANANRRAPTNMHNNAISVNSMHRIRAAQDLALYNLTLSEEVDGFDGTAHLETWNSDPAWQGVREVAEQLTAIDDWAGAIFAANVVFEPLVGELFRSNLVQHAAPRNGDFVTPTVVGAAEYDYAERDLRYTRSMFELLTNDREFADHNKAILQQYLSDWVPRAISAARTLQPLWSQPDAKPPRFEDGLDQAKSRFSGIVTDLGLETPKELAQ | Actinobacteria | WP_027935444 | 85%
particulate methane monooxygenase | pmoB | MKPTMFSSLARQAGRLWALVLAAGLAVTMAGIGPADAHGEKSQAAFLRMRTLNWYDVKWSKTNVTVNEEYEITGKLHIMNAWPAAIEIPAQCFLNTGQPGAMAARLGVWVGGTFTPRSMKLEVKTYAFRVLLKARRPGHWHTHVQLSVKTGGPIPGPGQYIDIKGNFSDYTDDVKLLNGTTVDIETYGISKIYMWHLFWIIVGGWWILYWFGKRGFIGRFAWVASGKAEEVITPQERMVGAITLVVLLVVIVFYAMTVSGNPNTIPLQAGDFHNIQALENEVDSGPITLKYLNGTYKVPGRELVANFKITNNGKEPVRIGEFNTAGLRFLNPDVFTSKVEYPDYLLADRGLSLSDNSPIAPGETRDVVSVQDARWDTERLSGLAYDVDSSFAGVLFFFSPSGARYPMEVSGAVIPTFMPV | alpha-Proteobacteria (bin 15) | WP_051953405 | 67%
particulate methane monooxygenase | pmoB | MNAWPAAIEVPERCFLDIRQPGPVANRLGVWVGGQFTPRPMKLELGKTYEFRILLKARRPGHWHTHVQLSVETGPIPGPGQYIDIKGNFADFTDDVKSLNGTTVDLETYGQAKIYMWHLFWIIGGWWILYWFGKRGFVGRFAWVATGKAEDLITPQERVIGALTLCGVLLVVIIFYAITVNNYPNTVPLQAGDFRNIQAIDGPEAVNGGRSISNT | alpha-Proteobacteria | WP_051953405 | 61%
particulate methane monooxygenase | pmoB | LNPEVYTTKVDYPHYLLAERGLSLSDNAPIQPGETKDIAVTSQDARWDTERLSGLANDVDSSFAGVISFSTPSGTSYRTEVGGGVITQGEGTTQKQPPDRTLNIYYATNRDADQSGIRLNYTSSADKLSFGIMQVHVPDNHRKGQVEYDPNFKLMSIQFKRDDLIEQEYMQHNFVIRGMVPLERSAFIALLGEDNRDTALVFVHGYNNSFSNGAFRLAQIVWDGQLWGSIPVLFS | alpha-Proteobacteria | CAJ01618 | 72%
particulate methane monooxygenase | pmoA | MLRDKSIKTGAVPAGESISASPGIEGGAGANAPVAAPNVGATTAGHDHHAAAGSPFHSRAEAATAVHTADLLILTFLFLIMIGGYHVHAMLTMGDWDFWVDWKDRRMWPTVLPIMLVTFPAAAYFFWEHFRLPFGATFLCVALLFGEWLDRYISFWGWTFYPINLVWPTSLVPQALFLDIVLLLSKSFIVTAIVGSMGFSLLLYPNNWVILAQFHAPTEQYGTLMSLADVIGFHNVRTSMPEYIRIERGTMRTFGKDVVGVASFFAGFVSIIVYFVWWFVGKMFSTTKYMKSI | alpha-Proteobacteria (bin 15) | CAJ01617 | 71%
particulate methane monooxygenase | pmoA | MGDWDFWVDWKDRRMWPTVLPIMLVTFQFHAPTEQYGTLMSLADVIGFHNVRTSMPEYIRIIERGTMRTFGKDVVGVAAFFSGFVSIIVYFVWWYVGKLFSTVKYMKSI | alpha-Proteobacteria | WP_036262482 | 74%
particulate methane monooxygenase | pmoC | MSLVTGTARTGEAAAVAEAPLFNGMPLILGTIAINVFYVGVRIYEQVFGQFAGLDSFAPEFTTYWMTILYIEEPVELISFLALVGWMWKTRDMDVANVQPREEMRRVFNLISWIMMYGIAIYWGASYFTEQDGTWHMTVIRDTDFTPSHIIEFYMSYPMYIVIGVGGFMYARTRLPTYACKGWSIAYVLLFVGPFMIFPNVGLNEWGHTFWFMEELFVEPLHWMFVFFGWFSLAVFGVTLQLIGRVVELAHGHEELLGLEPAE | alpha-Proteobacteria | CAJ01616 | 72%

Appendix 4.1 Sequence origin of predicted ORF within WPS-2 draft genomes bin22 and bin23

Columns: Domain; Phylum; No. of hits 60% (bin22, bin23); No. of hits 90% (bin22, bin23)

Archaea Euryarchaeota 3 1
Archaea Thaumarchaeota 1
Bacteria Acidobacteria 26 12
Bacteria Actinobacteria 35 22
Bacteria 1 1
Bacteria Armatimonadetes 9 4
Bacteria Bacteroidetes 4 4
Bacteria Candidatus 1
Bacteria Chloroflexi 44 31 1
Bacteria Cyanobacteria 11 6
Bacteria Deinococcus-Thermus 5 3
Bacteria Fervidibacteria 1 1
Bacteria Firmicutes 101 86 2
Bacteria 1
Bacteria Gemmatimonadetes 3 1
Bacteria Ignavibacteriae 1 3
Bacteria Microgenomates 1
Bacteria 1
Bacteria 2 1
Bacteria Omnitrophica 1
Bacteria Planctomycetes 1 2
Bacteria Proteobacteria 68 47
Bacteria 1
Bacteria Synergistetes 1
Bacteria Thermotogae 2
Bacteria Verrucomicrobia 2 2
Bacteria WS1 1
Eukaryota Porifera 1

Appendix 4.2 Predicted glycoside hydrolases within WPS-2 draft genomes bin22 and bin23 using dbCAN search.

Draft genome | Enzyme class id | Known activity
Bin22 | GH92.hmm | mannosyl-oligosaccharide α-1,2-mannosidase (EC 3.2.1.113); mannosyl-oligosaccharide α-1,3-mannosidase (EC 3.2.1.-); mannosyl-oligosaccharide α-1,6-mannosidase (EC 3.2.1.-); α-mannosidase (EC 3.2.1.24); α-1,2-mannosidase (EC 3.2.1.-); α-1,3-mannosidase (EC 3.2.1.-); α-1,4-mannosidase (EC 3.2.1.-); mannosyl-1-phosphodiester α-1,P-mannosidase (EC 3.2.1.-)
Bin22 | GH92.hmm | mannosyl-oligosaccharide α-1,2-mannosidase (EC 3.2.1.113); mannosyl-oligosaccharide α-1,3-mannosidase (EC 3.2.1.-); mannosyl-oligosaccharide α-1,6-mannosidase (EC 3.2.1.-); α-mannosidase (EC 3.2.1.24); α-1,2-mannosidase (EC 3.2.1.-); α-1,3-mannosidase (EC 3.2.1.-); α-1,4-mannosidase (EC 3.2.1.-); mannosyl-1-phosphodiester α-1,P-mannosidase (EC 3.2.1.-)
Bin22 | GH125.hmm | exo-α-1,6-mannosidase
Bin23 | GH125.hmm | exo-α-1,6-mannosidase
Bin23 | GH125.hmm | exo-α-1,6-mannosidase
Bin23 | GH125.hmm | exo-α-1,6-mannosidase
Bin23 | GH125.hmm | exo-α-1,6-mannosidase
Bin22 | GH125.hmm | exo-α-1,6-mannosidase
Bin23 | GH130.hmm | β-1,4-mannosylglucose phosphorylase (EC 2.4.1.281); β-1,4-mannooligosaccharide phosphorylase (EC 2.4.1.319); β-1,4-mannosyl-N-acetyl-glucosamine phosphorylase (EC 2.4.1.320); β-1,2-mannobiose phosphorylase (EC 2.4.1.-); β-1,2-oligomannan phosphorylase (EC 2.4.1.-); β-1,2-mannosidase (EC 3.2.1.-)
Bin22 | GH130.hmm | β-1,4-mannosylglucose phosphorylase (EC 2.4.1.281); β-1,4-mannooligosaccharide phosphorylase (EC 2.4.1.319); β-1,4-mannosyl-N-acetyl-glucosamine phosphorylase (EC 2.4.1.320); β-1,2-mannobiose phosphorylase (EC 2.4.1.-); β-1,2-oligomannan phosphorylase (EC 2.4.1.-); β-1,2-mannosidase (EC 3.2.1.-)
Bin22 | GH109.hmm | α-N-acetylgalactosaminidase
Bin23 | GH109.hmm | α-N-acetylgalactosaminidase
Bin23 | GH20.hmm | β-hexosaminidase (EC 3.2.1.52); lacto-N-biosidase (EC 3.2.1.140); β-1,6-N-acetylglucosaminidase (EC 3.2.1.-); β-6-SO3-N-acetylglucosaminidase (EC 3.2.1.-)
Bin23 | GH20.hmm | β-hexosaminidase (EC 3.2.1.52); lacto-N-biosidase (EC 3.2.1.140); β-1,6-N-acetylglucosaminidase (EC 3.2.1.-); β-6-SO3-N-acetylglucosaminidase (EC 3.2.1.-)
Bin23 | GH109.hmm | α-N-acetylgalactosaminidase
Bin22 | GH109.hmm | α-N-acetylgalactosaminidase
Bin22 | GH109.hmm | α-N-acetylgalactosaminidase
Bin22 | GH3.hmm | β-glucosidase (EC 3.2.1.21); xylan 1,4-β-xylosidase (EC 3.2.1.37); β-glucosylceramidase (EC 3.2.1.45); β-N-acetylhexosaminidase (EC 3.2.1.52); α-L-arabinofuranosidase (EC 3.2.1.55); glucan 1,3-β-glucosidase (EC 3.2.1.58); glucan 1,4-β-glucosidase (EC 3.2.1.74); isoprimeverose-producing oligoxyloglucan hydrolase (EC 3.2.1.120); coniferin β-glucosidase (EC 3.2.1.126); exo-1,3-1,4-glucanase (EC 3.2.1.-); β-N-acetylglucosaminide phosphorylases (EC 2.4.1.-)
Bin23 | GH3.hmm | β-glucosidase (EC 3.2.1.21); xylan 1,4-β-xylosidase (EC 3.2.1.37); β-glucosylceramidase (EC 3.2.1.45); β-N-acetylhexosaminidase (EC 3.2.1.52); α-L-arabinofuranosidase (EC 3.2.1.55); glucan 1,3-β-glucosidase (EC 3.2.1.58); glucan 1,4-β-glucosidase (EC 3.2.1.74); isoprimeverose-producing oligoxyloglucan hydrolase (EC 3.2.1.120); coniferin β-glucosidase (EC 3.2.1.126); exo-1,3-1,4-glucanase (EC 3.2.1.-); β-N-acetylglucosaminide phosphorylases (EC 2.4.1.-)
Bin22 | GH13.hmm | α-amylase (EC 3.2.1.1); pullulanase (EC 3.2.1.41); cyclomaltodextrin glucanotransferase (EC 2.4.1.19); cyclomaltodextrinase (EC 3.2.1.54); trehalose-6-phosphate hydrolase (EC 3.2.1.93); oligo-α-glucosidase (EC 3.2.1.10); maltogenic amylase (EC 3.2.1.133); neopullulanase (EC 3.2.1.135); α-glucosidase (EC 3.2.1.20); maltotetraose-forming α-amylase (EC 3.2.1.60); isoamylase (EC 3.2.1.68); glucodextranase (EC 3.2.1.70); maltohexaose-forming α-amylase (EC 3.2.1.98); maltotriose-forming α-amylase (EC 3.2.1.116); branching enzyme (EC 2.4.1.18); trehalose synthase (EC 5.4.99.16); 4-α-glucanotransferase (EC 2.4.1.25); maltopentaose-forming α-amylase (EC 3.2.1.-); amylosucrase (EC 2.4.1.4); sucrose phosphorylase (EC 2.4.1.7); malto-oligosyltrehalose trehalohydrolase (EC 3.2.1.141); isomaltulose synthase (EC 5.4.99.11); malto-oligosyltrehalose synthase (EC 5.4.99.15); amylo-α-1,6-glucosidase (EC 3.2.1.33); α-1,4-glucan: phosphate α-maltosyltransferase (EC 2.4.99.16); 6'-P-sucrose phosphorylase (EC 2.4.1.-); amino acid transporter
Bin23 | GH15.hmm | glucoamylase (EC 3.2.1.3); glucodextranase (EC 3.2.1.70); α,α-trehalase (EC 3.2.1.28); dextran dextrinase
Bin22 | GH15.hmm | glucoamylase (EC 3.2.1.3); glucodextranase (EC 3.2.1.70); α,α-trehalase (EC 3.2.1.28); dextran dextrinase

Appendix 4.3 Comparison of cell envelope biosynthesis related genes between recovered WPS-2 genomes and a range of bacteria and archaea.

The heat map shows the presence or absence of membrane and cell wall biosynthesis related genes that are differentially present in monoderm and diderm prokaryotic cells. The cluster analysis separated monoderm bacteria, diderm bacteria and archaea. A dashed line indicates clustering that was not statistically significant, while a solid line indicates clustering that was confirmed by similarity profile analysis.

Appendix 5.1 Transformation of environmental parameters to reduce skewness of data.

CEC: cation exchange capacity; CECe: effective cation exchange capacity

Not transformed Log transformed 1/(1+x) transformed Latitude dry matter fraction Slope Elevation Al as percentage of eCEC pH Aspect water extractable NO3 conductivity water extractable SO4 water extractable NH4 Mud percentage Sand percentage Gravel percentage Minimum particle size Kurtosis Mean particle size TiO2 Quartile Deviation Al2O3 (PQD) Sorting Coeff MnO Graphic Skewness CaO SiO2 SO3 Fe2O3 Total nitrogen MgO Total phosphorous Na2O Mg as percentage of CECe Cl B as percentage of CECe TC Cu as percentage of CECe S as percentage of Fe as percentage of CECe CECe 2M KCl extractable Al Mn as percentage of CECe Mg as percentage of eCEC CECe K as percentage of Ca as percentage of CECe CECe Na as percentage of CECe
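The two transformations named in the column headings can be illustrated with a minimal sketch. It assumes that "Log" means a base-10 logarithm (the base is not stated here) and uses a simple moment-based skewness estimate. The function names and the example values are hypothetical and are not data from this thesis.

```python
import math

def skewness(values):
    """Moment-based skewness: third central moment divided by variance^1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def log_transform(values):
    # Assumption: base-10 logarithm; requires strictly positive values.
    return [math.log10(v) for v in values]

def reciprocal_transform(values):
    # The 1/(1+x) transformation; defined for any x other than -1.
    return [1.0 / (1.0 + v) for v in values]

# Hypothetical right-skewed measurements (illustration only).
example = [0.1, 0.2, 0.3, 0.5, 1.0, 2.0, 8.0, 40.0]

raw_skew = skewness(example)
log_skew = skewness(log_transform(example))
recip_skew = skewness(reciprocal_transform(example))
print(raw_skew, log_skew, recip_skew)
```

For a strongly right-skewed variable such as this one, the log-transformed values are much closer to symmetric than the raw values, which is the purpose of transforming parameters before analysis.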


Appendix 5.2 Environmental parameters that best explained the distribution of dominant WPS-2 OTUs in Arctic and Eastern Antarctica based on DistLM analysis.

Variable | Adj R2 | Pseudo-F | Proportion of variation explained
Elevation | 0.20212 | 21.266 | 0.2121
%Mud | 0.33089 | 16.204 | 0.13553
CEC Fe | 0.41986 | 12.962 | 9.40E-02
Slope | 0.478 | 9.5758 | 6.25E-02
CEC Mg | 0.50024 | 4.3824 | 2.74E-02
Latitude | 0.52605 | 5.0845 | 3.01E-02
Na2O | 0.53941 | 3.1464 | 1.81E-02
CEC Al | 0.55135 | 2.9433 | 1.65E-02
PQD | 0.56152 | 2.6687 | 1.46E-02
CEC S | 0.57095 | 2.5612 | 1.37E-02
TC | 0.5801 | 2.5246 | 1.33E-02
Aspect | 0.59256 | 3.11 | 1.58E-02
TKN | 0.60201 | 2.6158 | 1.30E-02
dry matter fraction | 0.60884 | 2.1701 | 1.06E-02
Graphic Skewness | 0.62161 | 2.1531 | 1.02E-02
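The Adj R2 column rises as each variable is added, which suggests a sequential (step-wise) fit, while the last column gives each variable's own contribution. As a minimal sketch under that assumption, the contributions can be accumulated to give the running unadjusted proportion explained. The `adjusted_r2` helper uses the standard adjustment formula; its `n_samples` argument is a hypothetical parameter, since the number of samples is not given in this table.

```python
from itertools import accumulate

# "Proportion of variation explained" column, in table order
# (Elevation, %Mud, CEC Fe, ..., Graphic Skewness).
proportions = [
    0.2121, 0.13553, 9.40e-02, 6.25e-02, 2.74e-02,
    3.01e-02, 1.81e-02, 1.65e-02, 1.46e-02, 1.37e-02,
    1.33e-02, 1.58e-02, 1.30e-02, 1.06e-02, 1.02e-02,
]

# Running total after each variable is added to the model.
cumulative = list(accumulate(proportions))

def adjusted_r2(r2, n_samples, n_predictors):
    """Standard adjusted R^2; n_samples is not stated in the table."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)

print(round(cumulative[-1], 5))
```

Summing the column this way gives a total of roughly 0.69 for the fifteen variables, which is higher than the final Adj R2 because the adjustment penalises the number of predictors.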
