Establishing Optimum DNA Annotation Methods to Investigate the Impacts of Flooding on Microbial Communities and Functions

Richard James Randle-Boggis
PhD
University of York
Biology
June 2016

Abstract

Environmental change will have significant impacts on microbial ecosystems. Microorganisms dominate most biogeochemical pathways, and environmental perturbations may alter these functions. Such functions include nutrient cycling, pollution abatement and greenhouse gas emission, and it is paramount that the impact of environmental change on ecosystems is understood. High-throughput DNA sequencing provides a window into complex microbial communities and their functional potential, thus allowing us to empirically study how such communities respond to predicted future environments. There are, however, caveats and challenges associated with such technologies, particularly with converting billions of sequencing base calls into species and function counts. This thesis first quantifies the performance of sequence annotation tools and parameters using a simulated metagenome. It is found that tools differ in performance, and that parameter selection can significantly reduce annotation accuracy; for example, One Codex correctly annotated many sequences at the genus level, whereas MG-RAST RefSeq produced many false positive annotations. The results provide a guideline to quantitatively inform researchers about the impacts of certain choices on annotation performance, and show that some published studies may be drawing incorrect conclusions. This thesis also investigates the impacts of increased flooding frequency and duration on soil microbial ecosystems, in line with predicted climate change. Increased frequency has significant impacts on biodiversity, community composition and potential function. SkyLine, a novel, continuous gas flux measuring system, was used to record CO2 and CH4 fluxes.
Increased flooding duration significantly reduced CH4 oxidation and increased CO2 assimilation, reducing the combined global warming potential of these gases.

Contents

Abstract
Contents
List of Tables
List of Figures
Acknowledgements
Author’s declaration
1 Introduction
1.1 Environmental change and flooding
1.1.1 Climate change
1.1.2 Microbial ecosystems
1.1.3 Methanogenesis and methane oxidation
1.2 Metagenomics
1.2.1 Overview
1.2.2 DNA sequencing
1.2.2.1 Sanger sequencing
1.2.2.2 454 Pyrosequencing
1.2.2.3 Ion Torrent
1.2.2.4 Illumina sequencing
1.2.2.5 Nanopore sequencing
1.2.3 Amplicons, metagenomes or whole genomes?
1.2.4 Analysis
1.2.5 Environmental applications
1.2.6 Alternative methods
1.3 Overview and aims
2 Evaluating techniques for metagenome annotation using simulated sequence data.
2.1 Abstract
2.2 Introduction
2.2.1 Databases
2.2.2 Parameters
2.2.3 Aims
2.3 Methodology
2.3.1 Metagenome simulation
2.3.2 Analysis
2.4 Results
2.4.1 Simulation and annotation
2.4.2 Parameters (Blast and MG-RAST)
2.4.3 Annotation sensitivity and precision
2.4.4 Taxa abundance correlations
2.4.5 Taxa richness
2.5 Discussion
3 The effects of increased flooding frequency on a laboratory controlled microbial ecosystem.
3.1 Abstract
3.2 Introduction
3.2.1 Climate change and flooding
3.2.2 Flooding and microbial ecosystems
3.2.3 Hypotheses
3.3 Methodology
3.3.1 Experimental design
3.3.2 Treatment
3.3.3 DNA sampling
3.3.4 Sequencing
3.3.5 Analyses
3.4 Results
3.4.1 Sequencing
3.4.2 Diversity and Bacteria:Archaea ratio
3.4.3 Sample dissimilarities
