Unraveling evolution through Next Generation Sequencing

Michael Westbury

University dissertation for the award of the academic degree "doctor rerum naturalium" (Dr. rer. nat.) in the scientific discipline "Evolutionary Genetics", submitted to the Faculty of Mathematics and Natural Sciences, Institute of Biochemistry and Biology, University of Potsdam

Main supervisor: Prof. Michael Hofreiter (University of Potsdam)
Further reviewers: Associate Prof. Eline Lorenzen (Natural History Museum of Denmark) and Prof. Jon Waters (University of Otago, New Zealand)

Published online at the Institutional Repository of the University of Potsdam:
URN urn:nbn:de:kobv:517-opus4-409981
http://nbn-resolving.de/urn:nbn:de:kobv:517-opus4-409981

Declaration of authorship

I, Michael Westbury, declare that the work contained within this thesis, titled “Unraveling evolution through Next Generation Sequencing”, is purely my own unless stated otherwise.

Chapter 2: “Complete mitochondrial genome of a bat eared fox (Otocyon megalotis), along with phylogenetic considerations”
This manuscript was published in “Mitochondrial DNA Part B” in 2017 and can be found online at http://dx.doi.org/10.1080/23802359.2017.1331325. I carried out the extraction lab work and the data analyses, and wrote the manuscript.

Chapter 3: “A mitogenomic timetree for Darwin’s enigmatic South American mammal Macrauchenia patachonica”
This manuscript was published in “Nature Communications” in 2017 under the doi 10.1038/ncomms15951. I carried out the majority of the lab work, including processing the sample that yielded DNA. Sina Baleka processed some samples that did not yield DNA. I performed the majority of the data analyses, assisted by Axel Barlow, Sina Baleka, Stefanie Hartmann, and Johanna L.A. Paijmans. I wrote the majority of the methods and results in the manuscript. I also contributed to the introduction and discussion of the manuscript, together with Ross D.E. MacPhee and Michael Hofreiter.
Chapter 4: “Population and conservation genomics of the world's rarest hyena species, the brown hyena (Parahyena brunnea)”
This manuscript has been submitted to “Genome Research” and is also available online on the preprint server “bioRxiv” under https://doi.org/10.1101/170621. I conceived the project idea together with Michael Hofreiter. I performed all lab work excluding the two captive samples (brown and striped hyena). Love Dalén and the National Genomics Infrastructure (Stockholm) processed the two captive samples. I performed all analyses apart from the sliding trees, which Stefanie Hartmann performed. I wrote the majority of the manuscript.

Signed: Michael Westbury

Acknowledgements

Firstly, I would like to acknowledge and thank my supervisor, Prof. Michael Hofreiter, for giving me the opportunity to work on such interesting and state-of-the-art projects. Without your guidance and support, I am sure I would have been lost. I really appreciated the chances you gave me to travel all around the world, meet new people and learn new skills on the work budget.

Furthermore, I would like to thank …

All of the collaborators on the projects presented within this thesis.

The Evolutionary Adaptive Genomics group, present and past members. We have grown so significantly over the years that there are too many to name.

Stefanie Hartmann, for being the speediest proofreader of all time and giving some valuable feedback on the thesis.

The Potsdam Porcupines, for giving me something to do apart from work.

The Fellas over fame crew, for supplying good banter in times of need and boredom, and all of the lads around the world.

A special thanks to my family, especially mum and dad, who encouraged me to leave the homeland in pursuit of greater things. I am grateful that even when what I was doing seemed a bit abstract, you always were, or at least pretended to be, interested.

Last but not least, I would like to thank the lovely Binia De Cahsan.
You helped me through the nightmare of the German bureaucratic system, supplied ample German language knowledge and put up with all of my rubbish. I greatly appreciate all the time and help you give me and wouldn’t have wanted it any other way.

Table of contents

Summary
Zusammenfassung
Chapter 1: Introduction
  1.1 Traditional DNA sequencing
  1.2 Next generation sequencing
    1.2.1 Illumina technologies
  1.3 Applications of NGS
  1.4 Benefits of NGS within evolutionary biology
  1.5 Ancient DNA
  1.6 Data analyses
    1.6.1 De novo assemblies
    1.6.2 Mapping assemblies
      1.6.2.1 Low coverage mapping
      1.6.2.2 Iterative mapping
  1.7 Thesis outline
Chapter 2: Article I
  2.1 Abstract
  2.2 Main text
Chapter 3: Article II
  3.1 Abstract
  3.2 Introduction
  3.3 Results
    3.3.1 Sample screening
    3.3.2 Validation of the iterative mapping approach using MITObim
    3.3.3 Mitochondrial genome reconstruction of MAC002
    3.3.4 Macrauchenia mitochondrial sequence validation
    3.3.5 Phylogenetic reconstruction
  3.4 Discussion
  3.5 Methods
    3.5.1 Samples
    3.5.2 DNA preparation
    3.5.3 Test sequencing and analysis
    3.5.4 Further extractions of MAC002
    3.5.5 Deep sequencing of MAC002
    3.5.6 Mitochondrial reconstruction
    3.5.7 MITObim validation
    3.5.8 MAC002 mitochondrial genome reconstruction
    3.5.9 Final sequence validation
    3.5.10 Retrospective mapping of other Macrauchenia and Toxodon samples
    3.5.11 Phylogenetic reconstruction
Chapter 4: Article III
  4.1 Abstract
  4.2 Introduction
  4.3 Results
    4.3.1 Genome reconstructions
    4.3.2 Genetic diversity
    4.3.3 Demographic history
    4.3.4 Population structure
  4.4 Discussion
    4.4.1 Genomic diversity
    4.4.2 Brown hyena population structure
    4.4.3 Conservation implications
  4.5 Methods
    4.5.1 Samples
    4.5.2 Striped hyena de novo assembly
    4.5.3 Captive brown hyena sample
    4.5.4 Wild caught brown hyena samples
    4.5.5 Raw data treatment
    4.5.6 Mitochondrial genome reconstructions
    4.5.7 Mitochondrial analyses
    4.5.8 Low coverage nuclear genome analyses
    4.5.9 Brown hyena population structure
    4.5.10 Comparative population structures
    4.5.11 Species heterozygosity estimates
    4.5.12 Demographic inference
Chapter 5: Discussion
  5.1 Aims and importance of the thesis
  5.2 Evolutionary insights through NGS
  5.3 Conservation
  5.4 Bioinformatic advances
  5.5 NGS in the future
  5.6 General conclusions
Bibliography
Appendix A
Appendix B

List of abbreviations

Bp - base pairs
Kya - thousand years ago
Ma - million years ago
NGS - next generation sequencing
DNA - deoxyribonucleic acid
Mbp - megabase pairs (1,000,000 bp)
PCR - polymerase chain reaction
RFLP - restriction fragment length polymorphism
SNP - single nucleotide polymorphism
aDNA - ancient DNA

Summary

The sequencing of the human genome in the early 2000s led to an increased interest in cheap and fast sequencing technologies. This interest culminated in the advent of next generation sequencing (NGS). A number of different NGS platforms have arisen since then, all promising to do the same thing, i.e. produce large amounts of genetic information at relatively low cost compared to more traditional methods such as Sanger sequencing. The capabilities of NGS meant that researchers were no longer bound to species for which substantial previous work had already been done (e.g. model organisms and humans), enabling a shift in research towards more novel and diverse species of interest. This capability has greatly benefitted many fields within the