Massively Parallel Sequencing of Human Mitochondrial Genome for Forensic Analysis
TALLINN UNIVERSITY OF TECHNOLOGY
DOCTORAL THESIS 15/2020

MONIKA STOLJAROVA

School of Science
Department of Chemistry and Biotechnology

This dissertation was accepted for the defence of the degree 19/05/2020

Supervisor: Dr. Anu Aaspõllu, Department of Chemistry and Biotechnology, Tallinn University of Technology, Tallinn, Estonia
Co-supervisor: Dr. Bruce Budowle, Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, USA
Opponents: Prof Ángel Carracedo Álvarez, Institute of Forensic Sciences, University of Santiago de Compostela, Santiago de Compostela, Galicia, Spain
Prof Mait Metspalu, Institute of Genomics, University of Tartu, Tartu, Estonia

Defence of the thesis: 22/06/2020, Tallinn

Declaration: Hereby I declare that this doctoral thesis, my original investigation and achievement, submitted for the doctoral degree at Tallinn University of Technology has not been submitted for doctoral or equivalent academic degree.
Monika Stoljarova (signature)

Copyright: Monika Stoljarova, 2020
ISSN 2585-6898 (publication)
ISBN 978-9949-83-552-2 (publication)
ISSN 2585-6901 (PDF)
ISBN 978-9949-83-553-9 (PDF)

Estonian title: Inimese mitokondriaalse genoomi massiivselt paralleelne sekveneerimine forensiliseks analüüsiks

Contents

List of Publications
Author’s Contribution to the Publications
Introduction
Abbreviations
1 Literature overview
1.1 DNA typing for human identification
1.2 Genetics of mitochondrial DNA
1.3 Mitochondrial DNA phylogenetics
1.4 Mitochondrial DNA typing in forensic casework
1.5 Massively parallel sequencing for human identification
2 Aims of the study
3 Materials and methods
4 Results and discussion
4.1 The throughput level of MiSeq platform and Nextera XT DNA Preparation Kit based on read depth, strand bias and variant calls when sequencing mtGenomes (Publication I)
4.2 The comparison of discrimination power of mtGenome and HVI/HVII regions (Publications I and II)
4.3 Genetic description of an Estonian population sample (Publication II)
4.4 Mixture sample mtDNA analysis with MPS (Publication III)
Conclusion
References
Acknowledgements
Abstract
Lühikokkuvõte (summary in Estonian)
Appendix
Publication I
Publication II
Publication III
Curriculum vitae
Elulookirjeldus (curriculum vitae in Estonian)

List of Publications

The list of author’s publications, on the basis of which the thesis has been prepared:

I King, J. L., LaRue, B. L., Novroski, N. M., Stoljarova, M., Seo, S. B., Zeng, X., Warshauer, D. H., Davis, C. P., Parson, W., Sajantila, A., & Budowle, B. (2014). High-quality and high-throughput massively parallel sequencing of the human mitochondrial genome using the Illumina MiSeq. Forensic Sci Int Genet, 12, 128-135. doi:10.1016/j.fsigen.2014.06.001

II Stoljarova, M., King, J. L., Takahashi, M., Aaspollu, A., & Budowle, B. (2016). Whole mitochondrial genome genetic diversity in an Estonian population sample. Int J Legal Med, 130(1), 67-71. doi:10.1007/s00414-015-1249-4

III Churchill, J. D., Stoljarova, M., King, J. L., & Budowle, B. (2018). Massively parallel sequencing-enabled mixture analysis of mitochondrial DNA samples. Int J Legal Med, 132(5), 1263-1272.
doi:10.1007/s00414-018-1799-3

Author’s Contribution to the Publications

The author’s contributions to the papers in this thesis are:

I Performance of the experimental work, analysis and interpretation of the data
II Performance of the experimental work, analysis and interpretation of the data, preparation of the manuscript
III Performance of the experimental work, analysis and interpretation of the data

Introduction

In forensic identification, mitochondrial DNA (mtDNA) is used for kinship analysis and for degraded or low-quality DNA samples (e.g. teeth, hair, bones, touch traces). Although mtDNA has lower discrimination power than nuclear DNA short tandem repeat (STR) markers, its lack of recombination and strictly maternal inheritance make it a valuable marker system for kinship analysis. In addition, the high number of mtDNA copies per cell (on average 500 copies versus two copies of nuclear DNA) increases the success rate when analysing low-level DNA samples.

Based on shared single nucleotide polymorphisms (SNPs), mtDNA sequences (haplotypes) are divided into genealogical groups (haplogroups) whose members share a common ancestor. The migration map of women from Africa onto different continents, 150,000 years before present, has been reconstructed using mtDNA analysis. Haplogroup assignment has been suggested as a quality-control measure for mtDNA profiles. In the forensic context, mtDNA typing of population samples is important for establishing databases and estimating mtDNA haplotype frequencies.

Mixtures, i.e. samples that originate from two or more donors and are commonly recovered from touch, sexual and/or physical assault crime scenes, are among the most challenging samples in forensic investigations. Mixture deconvolution via autosomal STR loci and the subsequent statistical assessment of the results can be complicated (e.g., by allele overlap among an unknown number of contributors and by allele drop-out).
The allele ratios of mtDNA mixture positions can be used as an alternative or adjunct tool for mixture interpretation.

The gold standard for mtDNA analysis in forensic investigations is Sanger-type sequencing (STS) of hypervariable regions I and II (HVI and HVII, respectively). Sequencing beyond HVI/HVII, that is, approximately the other 96.3% of the mitochondrial genome (mtGenome), is rarely attempted because the technique is laborious, time-consuming and expensive. However, it has been shown that the discrimination power of entire mtGenome data is much higher than that of HVI/HVII and can resolve common HVI/HVII haplotypes. Moreover, STS has been shown to have a number of limitations when mixture deconvolution is attempted.

Massively parallel sequencing (MPS) provides a substantial increase in throughput, which makes it a desirable alternative to STS. MPS tracks the incorporation of each nucleotide of each molecule (or clone) as the DNA chain is elongated during primer extension. Therefore, interpretation of point as well as length heteroplasmy can be attempted. Sequencing the entire mtGenome via MPS, together with the quantitative data generated, can provide additional information for mixture interpretation.

The aims of this thesis were, first, to evaluate the feasibility of an MPS system for generating entire mtGenome sequences; second, to compare the discrimination power of mtGenome and HVI/HVII data in different population samples; and third, to interpret mixture samples using MPS-generated quantitative mtDNA data, phasing information and phylogenetic assignment.

Abbreviations

AFA    African-American population
aiSNP  ancestry informative SNP
ATP    adenosine triphosphate
bp     base pair
CAU    Caucasian population
CE     capillary electrophoresis
ChrX   chromosome X
ChrY   chromosome Y
CO     cytochrome c oxidase
CODIS  Combined DNA Index System
CR     control region
CRS    Cambridge