Antarctic Biodiversity Surveys Using High Throughput Sequencing: Understanding Landscape and Communities of the Prince Charles Mountains

Thesis submitted in fulfilment of the requirements for Degree of Doctor of Philosophy
Paul Czechowski
December 2015
School of Biological Sciences

Difficulties are just things to overcome, after all - Ernest Shackleton

Contents

Abstract
Publications, Presentations, Awards
Thesis Declaration
Acknowledgments
Chapter Format

1. Antarctic terrestrial biodiversity and environmental metagenetics
  1.1. Introduction
  1.2. Technical considerations
    1.2.1. Extraction of environmental samples
    1.2.2. High throughput platforms
    1.2.3. Marker choice
    1.2.4. Library generation
    1.2.5. Amplification
    1.2.6. Sequence analysis
    1.2.7. Recent improvements of metagenetic HTS approaches
  1.3. The potential of metagenetics for Antarctic biology
    1.3.1. Community structures
    1.3.2. Geographic distribution of Antarctic biota
    1.3.3. Supporting conservation efforts
  1.4. Summary and conclusions
  1.5. Acknowledgements
  1.6. Authors contributions

2. Eukaryotic soil diversity of the Prince Charles Mountains
  2.1. Introduction
  2.2. Methods and materials
    2.2.1. Fieldwork, soil storage and DNA extraction
    2.2.2. Amplification and library generation
    2.2.3. Read processing
    2.2.4. Eukaryotic α and β diversity comparison
    2.2.5. Distribution of phylotypes across sites
    2.2.6. Species-level assignment of phylotypes
  2.3. Results
    2.3.1. Read processing
    2.3.2. Eukaryotic α and β diversity comparison
    2.3.3. Distribution of phylotypes across sites
    2.3.4. Species-level assignment of phylotypes
  2.4. Discussion
    2.4.1. Technical considerations
    2.4.2. Differences in eukaryotic diversity among three locations
    2.4.3. Distribution of highly abundant phylotypes
    2.4.4. Validity of species-level taxonomic assignments
  2.5. Summary and conclusions
  2.6. Acknowledgements
  2.7. Supplemental data
  2.8. Supplemental information
    2.8.1. Methods and Materials
    2.8.2. Results and comments

3. Phylotypes and morphotypes of Antarctic invertebrates
  3.1. Introduction
  3.2. Methods
    3.2.1. Samples
    3.2.2. DNA extractions
    3.2.3. Primers
    3.2.4. Amplification and sequencing
    3.2.5. Reference data for taxonomic assignments
    3.2.6. Generation of phylotype observations
    3.2.7. Selection of processing parameters for 18S and COI phylotypes
    3.2.8. Concordance between phylotypes and morphotypes
  3.3. Results
    3.3.1. Selection of analysis parameters
    3.3.2. Concordance between morphotypes and phylotypes
  3.4. Discussion
    3.4.1. Analysis parameters
    3.4.2. Detecting cryptic invertebrates
    3.4.3. Metagenetic marker choice for Antarctic invertebrates
  3.5. Conclusions
  3.6. Acknowledgements
  3.7. Supporting information
  3.8. Supplemental information
    3.8.1. Methods and Materials
    3.8.2. Tables and Figures
    3.8.3. Analysis code

4. Salinity gradients determine invertebrate distribution
  4.1. Introduction
  4.2. Methods
    4.2.1. Fieldwork
    4.2.2. Soil geochemical and mineral analysis
    4.2.3. Preparation and analysis of environmental observations
    4.2.4. Preparation and analysis of biological observations
    4.2.5. Constrained ordination
  4.3. Results
    4.3.1. Environmental data
    4.3.2. Biological data
    4.3.3. Biological data in relation to environment
  4.4. Discussion
  4.5. Conclusion
  4.6. Data accessibility
  4.7. Authors contributions
  4.8. Acknowledgements
  4.9. Funding
  4.10. Supplemental information
    4.10.1. Phylotype data generation for 18S and COI
    4.10.2. Sequence tag selection and amplicon orientations
    4.10.3. Intermediate results of environmental data processing
    4.10.4. Intermediate results of biological data processing
    4.10.5. Intermediate results of biological data in relation to environment
    4.10.6. Data and analysis scripts, additional figures and tables

5. Synthesis
  5.1. Summary
    5.1.1. Technical and computational methods
    5.1.2. Biodiversity information from the Prince Charles Mountains
  5.2. High throughput sequencing for Antarctica
  5.3. Implications and future improvements
  5.4. Conclusion
  5.5. Acknowledgements

A. Phylotype information chapter 2
B. Analysis code chapter 3
C. Analysis code chapter 4
D. Molecular tagging of amplicons

Bibliography

Abstract

Antarctic soils are home to small, inconspicuous organisms including bacteria, unicellular eukaryotes, fungi, lichen, cryptogamic plants and invertebrates. Antarctic soil communities are distinct from other soil biota as a consequence of long-term persistence under harsh environmental conditions; furthermore, their long history of isolation is responsible for a high degree of endemism. Of major concern is the establishment of non-indigenous species facilitated by human-mediated climate change and increased human activity, threatening the highly specialised endemic species. A lack of baseline information on terrestrial Antarctic biodiversity currently impairs efforts to conserve the unique but still largely unknown Antarctic biota. In this thesis I apply metagenetic high throughput sequencing (MHTS) methods to address the deficiency of biological information from remote regions of continental Antarctica, and use the data generated to explore environmental constraints on Antarctic biodiversity. In Chapter 1, I introduce current issues impeding the generation of baseline Antarctic biodiversity data and evaluate the application of MHTS techniques.
This review highlights the potential of amplicon-based MHTS approaches to retrieve eukaryotic biodiversity information from terrestrial Antarctica. In Chapter 2, the eukaryotic diversity of three biologically unsurveyed regions in the Prince Charles Mountains (PCMs), East Antarctica, is explored. Total eukaryote biodiversity in the PCMs appears to follow an altitudinal or latitudinal trend, which is less obvious for terrestrial invertebrates. In order to apply MHTS to the study of Antarctic invertebrates, the comparative taxonomic assignment fidelities of metagenetic markers and morphological approaches are explored in Chapter 3. Fidelities of taxonomic assignments to four Antarctic invertebrate phyla differed depending on the metagenetic marker, and these differences emerged only when non-arbitrary sequence processing parameters were applied. In Chapter 4, I use MHTS-derived biodiversity information to explore the relationship between soil properties and invertebrate biodiversity in the PCMs. Across large spatial scales, the distributions of the phyla Tardigrada and Arachnida and the classes Enoplea (Nematoda) and Bdelloidea (Rotifera) in inland areas are constrained by terrain-age-related accumulation of salts, while other classes (Chromadorea (Nematoda) and Monogononta (Rotifera)) are better able to tolerate high salinity. In moister, nutrient-richer and more coastal areas, this effect was less pronounced and a higher invertebrate diversity was found. The methods applied and developed in this thesis are a valuable starting point to advance the collection of biodiversity information across terrestrial Antarctica and other remote habitats.
The work presented here provides examples for the generation and usage of MHTS information from remote Antarctic habitats, demonstrates how biodiversity information retrieved using different metagenetic markers can be combined, develops methods for assessing the quality of MHTS markers and finally demonstrates the application of MHTS data to investigate the environmental determinants of invertebrate diversity in remote ice-free habitats. Future MHTS biodiversity studies of Antarctic terrestrial habitats should incorporate large sample numbers and use combined data from multiple genetic markers.

Publications, presentations and awards

Publications

2013 Laurence J. Clarke, Paul Czechowski, Julien Soubrier, Mark I. Stevens, Alan Cooper (2014). Modular tagging of amplicons using a single PCR for high-throughput sequencing. Molecular Ecology Resources. 14, 117-121. (See Appendix D.)

Conference presentations

2014 2014 SCAR Open Science Conference (SCAR, SkyCity, Auckland, New Zealand): Paul Czechowski, Duanne White, Laurence J. Clarke, Alan Cooper, Mark I. Stevens (April): High-throughput DNA sequencing reveals Antarctic soil biodiversity.

2014 Understanding Biodiversity Dynamics using diverse Data Sources (CBA, Australian National University, Canberra, Australia): Paul Czechowski, Duanne White, Laurence J. Clarke, Alan Cooper, Mark I. Stevens (April): High-throughput DNA sequencing reveals Antarctic soil biodiversity.

2013 Biodiversity Genomics Conference (CBA, Australian National University, Canberra, Australia): Paul Czechowski, Laurence J. Clarke, Alan Cooper, Mark I. Stevens (April): Exploring Distribution and Evolution of Antarctic Invertebrates using High-Throughput Sequencing.

Awards

2014 Australian Government's National Taxonomy Research Grant Program - Student Travel Grant, Australian Biological Resources Study, Government of Australia, $1,650 - travel support for conference attendance.

2013 MiSeq Pilot the Possibilities Grant Program, Illumina Australia, One library reagents kit for experimental libraries.

2012 Small Research Grants Scheme,
