ESCMID Online Lecture Library © by Author
Max-Planck-Institut für Menschheitsgeschichte
Max Planck Institute for the Science of Human History

Ancient DNA – methods, quality and acceptance criteria
Alexander Herbig
Department of Archaeogenetics
Max Planck Institute for the Science of Human History, Jena, Germany
03.06.2016
Application of Molecular Diagnostics in Forensic Microbiology
2 - 3 June 2016, Leuven, Belgium

Why ancient DNA?
• Historical Questions
• Evolutionary Questions

Denisovans
First Hominin discovered by Geneticists
Have fun!
[Figure: tree of Modern Human, Neandertal, Denisovan and Chimpanzee, with labels 350K and 200K; Krause et al., 2010]

Ancient Pathogen Genome Evolution
• Medieval Yersinia pestis (Bos et al., 2016)
• Pre-Columbian Mycobacterium tuberculosis (Bos et al., 2014)
• Medieval Mycobacterium leprae (Schuenemann et al., 2013)
• Helicobacter pylori from the Iceman (Maixner et al., 2016)

Ancient DNA: Properties and Challenges
• Low amount of endogenous DNA
• DNA highly fragmented
• Contamination by humans (clean room facilities)
• Contamination by environmental DNA

Targeted Enrichment
• Fraction of target organism's DNA is low
• Enrichment of target DNA using array capture
• Array design has to account for expected diversity
• Target organism might be only distantly related to known organisms
[Figure: DNA sources in a sample (Plant, Bacteria, Human, Fungal, Pathogen); label: Targeted DNA Enrichment]

Targeted Enrichment: Capture Arrays
[Figure: target molecules and array hybridisation; Stoneking and Krause, 2011]
• Collection of genomes of appropriate reference organisms
• Whole-genome alignment
• Identification of conserved and variable regions

Capture Arrays: Probe Design

Capture Arrays: Challenges
• More probes are needed for variable regions
• The maximal number of probes is limited (more than one array can be used)
• Melting temperature and other factors have to be taken into account
• Repetitive regions have to be excluded

Next Generation Sequencing: Data Processing
[Figure labels: Related Bacteria, Pathogen]
• DNA from related or unrelated bacteria is still contained
• Stricter filtering criteria have to be used to avoid processing data from related species
• But we do not want to be too strict. Where is the sweet spot?

Data Processing Pipeline

Data Processing Pipeline: Available Software
• EAGER: efficient ancient genome reconstruction
  Peltzer et al. 2016, Genome Biology
  http://it.inf.uni-tuebingen.de
• Characterization of ancient and modern genomes by SNP detection and phylogenomic and metagenomic analysis using PALEOMIX
  Schubert et al. 2014, Nature Protocols
  http://geogenetics.ku.dk/publications/paleomix

Data Processing: Read Merging
[Figure labels: Sequencing Reads, DNA Fragment]
• Overlapping read pairs due to short DNA fragments
• No standard paired-end mapping possible
• Overlap can be used to compensate for sequencing errors
• Merging of overlapping read pairs

Data Processing: Read Merging
• Construction of a consensus sequence in the overlapping region
• Selection of nucleotides with higher sequencing quality
• Results in merged reads with effectively higher sequencing quality in the overlapping region

Data Processing: De novo Assembly
[Figure labels: Sequencing Reads, Contig]
• Determine overlap between reads and combine them to create contigs
• Gaps occur due to low coverage
• No scaffolding possible (no 'real paired-end reads')

De novo Assembly: Ancient Leprosy
• De novo assembly of ancient M. leprae strain resulted in 169 contigs
• Coverage of more than 97% of the reference genome
• Most repetitive regions could not be resolved
• Metagenomic assembly consists of 2354 contigs
Schuenemann et al., 2013

Phylogenetic Analyses: Ancient Tuberculosis
• ~1000-year-old M. tuberculosis genomes from Peru
• Most closely related to strains found in seals and sea lions
Bos et al., 2014

Early History of Diseases: Bronze Age Plague
• Y. pestis genomes from Bronze Age remains
• Ancestral to all forms identified so far
• Insights into the evolution of Y. pestis virulence and the early history of plague
Rasmussen et al., 2015

The Human Microbiome
Ottman et al., 2012

The Human Microbiome: Co-Evolution

MALT
MALT - MEGAN ALignment Tool (Herbig et al., under review)
http://ab.inf.uni-tuebingen.de/software/malt
• Preprocessed Reads: adapter clipping etc.
• Spaced Seeds: compensating for mismatches, high sensitivity
• Hash Index: fast seed matching
• Precise Alignment: for in-depth downstream analyses
• Taxonomic Binning: for metagenomic analysis

The MALT Pipeline
Integration with the interactive metagenomics analysis software MEGAN (Huson et al. 2011)

MEGAN

MALT: Runtime Performance
Comparison to BLAST (Altschul et al. 1990) and lambda (Hauswedell et al. 2014)
[Figure: runtime [s] versus number of reads]
82% sensitivity at significance level 10^-5 (E-value)

The Tyrolean Iceman
• 5,300-year-old Copper Age glacier mummy
• Early farmer
• Various well-preserved soft tissues
Two Tissue Samples
• Gingiva tissue (oral cavity)
• Lung tissue

The Tyrolean Iceman: Oral Cavity vs. Lung
[Figure panels: Oral Cavity, Lung]

The Tyrolean Iceman: The Iceman’s Original Microbiomes
                           Oral Cavity        Lung
Total Reads                 12,486,937   8,503,240
Assigned Reads               1,098,791   1,867,705
Streptococcus                    5,551          55
Streptococcus mutans               159           0
Staphylococcus                   4,109          27
Staphylococcus aureus              341           0
Treponema denticola                 78           0
Filifactor alocis                  561           0
Fusobacterium nucleatum          1,941           0
Lactococcus lactis                 843           0
Haemophilus                        224          66
Klebsiella pneumoniae              172         375

The Tyrolean Iceman: The Iceman’s Original Microbiomes
Streptococcus mutans       Tooth decay
Filifactor alocis          Periodontal disease
Fusobacterium nucleatum    Periodontal plaque
Haemophilus influenzae     Bacterial meningitis
Klebsiella pneumoniae      Pneumonia

The Tyrolean Iceman: H. pylori from the Stomach
• Reconstructed H. pylori genome from the Iceman’s stomach
• Comparative analysis with contemporary strains
• Original European population?
• More recent African admixture?
Maixner et al., 2016

Authentication
• Is the recovered DNA of ancient origin?
• Can we differentiate species?
• Influence of contamination?
• Multiple infections?

Authentication Criteria
• Ancient DNA?
  • DNA Damage patterns

Authentication: DNA Damage Patterns
Deamination: Cytosine → Uracil → Thymine
mapDamage 2.0 (Jónsson et al. 2013)
[Figure: C→T frequency by position from the 5’ end]

Authentication
• Ancient DNA?
  • DNA Damage patterns
• Correct species?
  • Evenness of target sequence coverage

Cross-Mapping: Evenness of Coverage
Distribution of reads across the reference genome
[Figure panels: even, uneven]

Authentication
• Ancient DNA?
  • DNA Damage patterns
• Correct species?
  • Evenness of target sequence coverage
• Multiple species?
  • Distributions of similarity scores

Cross-Mapping: Similarity Distributions
Distributions of %identity values for aligned reads
[Figure labels: Positive, Negative, Background, Foreground]

Authentication
• Ancient DNA?
  • DNA Damage patterns
• Correct species?
  • Evenness of target sequence coverage
• Multiple species?
  • Distributions of similarity scores
• Multiple strains?
  • Distributions of allele frequencies

Cross-Mapping: Allele Frequencies
[Figure panels: Only one strain present; Multiple strains present in equal/different proportions]

Cross-Mapping: Allele Frequencies
[Figure panels: Only one strain present; ...with cross-mapping from other related bacteria]

Cross-Mapping: Allele Frequencies
[Figure panels: Default parameters; Stricter mapping and filtering]
Bos et al. 2014; Chan et al. 2013

Authentication Criteria
• Ancient DNA?
  • DNA Damage patterns
• Correct species?
  • Evenness of target sequence coverage
• Multiple species?
  • Distributions of similarity scores
• Multiple strains?
  • Distributions of allele frequencies

Summary: Steps to ancient DNA retrieval and analysis
• Sampling and DNA extraction (Clean Room)
• Screening
• Targeted (e.g. PCR-based)
• Non-Targeted (e.g. MALT)
• Authentication
• Genome Capture
• Probe Design
• Sequencing and Data Analysis
• Comparative Phylogenomics
• Virulence Factors
• …

Acknowledgements
The Archaeogenetics Department of the MPI for SHH, Jena
Johannes Krause
Åshild Vågene
Kirsten Bos
Michal Feldman
Verena Schuenemann
Maria Spyrou
Aida Andrades Valtueña
James Fellows Yates
Aditya Kumar Lankapalli
Florian Aldehoff
Daniel Huson
Frank Maixner
Benjamin Buchfink
Albert Zink
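
Note on read merging
The read-merging step from the Data Processing slides (overlapping read pairs from short fragments, a consensus in the overlapping region, and selection of the nucleotide with the higher sequencing quality) can be illustrated with a minimal Python sketch. It is not the implementation used by EAGER or PALEOMIX. The names `reverse_complement` and `merge_pair`, the parameters `min_overlap` and `max_mismatch_rate`, and the longest-overlap-first search are assumptions made for illustration. The sketch also assumes that adapters are already clipped and that qualities are given as lists of Phred scores.

```python
# Illustrative sketch only: function names, parameters and the overlap
# search strategy are assumptions, not taken from any specific tool.

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N)."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(fwd, fwd_q, rev, rev_q, min_overlap=10, max_mismatch_rate=0.1):
    """Merge a forward and a reverse read that overlap at their 3' ends.

    fwd, rev     : read sequences as sequenced (rev is reverse-complemented here)
    fwd_q, rev_q : per-base Phred qualities (lists of ints, same length as reads)
    Returns (merged_sequence, merged_qualities), or None if no overlap of at
    least `min_overlap` bases with an acceptable mismatch rate is found.
    """
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")

    # Bring the reverse read into the orientation of the forward read.
    rc = reverse_complement(rev)
    rc_q = rev_q[::-1]

    # Try the longest possible overlap first: suffix of fwd vs. prefix of rc.
    for k in range(min(len(fwd), len(rc)), min_overlap - 1, -1):
        suffix, prefix = fwd[len(fwd) - k:], rc[:k]
        mismatches = sum(x != y for x, y in zip(suffix, prefix))
        if mismatches <= max_mismatch_rate * k:
            break
    else:
        return None  # no acceptable overlap found

    start = len(fwd) - k  # first overlapping position in the forward read
    seq = list(fwd[:start])
    quals = list(fwd_q[:start])

    # Consensus in the overlap: keep the base with the higher quality.
    for i in range(k):
        qa, qb = fwd_q[start + i], rc_q[i]
        if qa >= qb:
            seq.append(suffix[i])
            quals.append(qa)
        else:
            seq.append(prefix[i])
            quals.append(qb)

    # Append the part of the reverse read that extends past the overlap.
    seq.extend(rc[k:])
    quals.extend(rc_q[k:])
    return "".join(seq), quals
```

Calling `merge_pair(fwd, fwd_q, rev, rev_q)` on a read pair returns the merged read and its qualities. A low-quality sequencing error inside the overlap is replaced by the higher-quality base from the other read, which is why merged reads have effectively higher quality in the overlapping region. Taking the maximum of the two qualities is a simplification; other ways of combining the scores are possible.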