Maximizing Mtdna Testing Potential with the Generation of High-Quality Mtgenome Reference Data Author(S): Rebecca S

The author(s) shown below used Federal funding provided by the U.S. Department of Justice to prepare the following resource: Document Title: Maximizing mtDNA Testing Potential with the Generation of High-Quality mtGenome Reference Data Author(s): Rebecca S. Just, Walther Parson, Jodi A. Irwin Document Number: 253077 Date Received: July 2019 Award Number: 2011-MU-MU-K402 This resource has not been published by the U.S. Department of Justice. This resource is being made publically available through the Office of Justice Programs’ National Criminal Justice Reference Service. Opinions or points of view expressed are those of the author(s) and do not necessarily reflect the official position or policies of the U.S. Department of Justice. Final Technical Report to the National Institute of Justice on Award Number 2011-MU-MU-K402 Maximizing mtDNA Testing Potential with the Generation of High-Quality mtGenome Reference Data Authors: a,b c,d a,b,e Rebecca S. Just , Walther Parson and Jodi A. Irwin aArmed Forces DNA Identification Laboratory, 115 Purple Heart Dr., Dover AFB, DE, 19902 bAmerican Registry of Pathology, 9210 Corporate Blvd., Suite 120, Rockville, MD, 20850 cInstitute of Legal Medicine, Innsbruck Medical University, Mllerstrasse 44, Innsbruck, Austria dPenn State Eberly College of Science, 517 Thomas Building, University Park, PA, 16802 eFederal Bureau of Investigation, 2501 Investigation Parkway, Quantico, VA 22135 Recipient Organization: American Registry of Pathology 9210 Corporate Blvd., Suite 120 Rockville, MD 20850 Project Period: April 1, 2011 - March 31, 2015 This resource was prepared by the author(s) using Federal funds provided by the U.S. Department of Justice. Opinions or points of view expressed are those of the author(s) and do not necessarily reflect the official position or policies of the U.S. Department of Justice. Final Technical Report Award #2011-MU-MU-K402 Abstract Forensic mitochondrial DNA (mtDNA) testing requires high quality reference population data for estimating the rarity of questioned haplotypes and, in turn, the strength of the mtDNA evidence. Though novel methods that quickly and easily recover mtDNA coding region data are becoming increasingly available, the appropriate reference data and database tools required for their routine application in forensic casework have been lacking. The primary goal of this National Institute of Justice funded grant was to address these deficiencies by: 1) increasing the large-scale availability of high quality entire mitochondrial genome (mtGenome) reference population data, and 2) improving the information technology infrastructure required to access/search mtGenome data and employ them in forensic casework. To meet the first of these objectives, we developed a Sanger-based sequencing strategy which could be performed in high-throughput fashion on robotic instrumentation. Using this strategy and an intensive, multi-step data review process, we produced 588 full mitochondrial genome haplotypes, from anonymized, randomly sampled blood serum specimens representing three U.S. population groups (African Americans, U.S. Caucasians and U.S. Hispanics), that meet all current forensic data quality standards. Nearly complete resolution of the haplotypes was achieved with full mtGenome sequences for all three populations, and comparisons to published control region (CR) datasets demonstrated that the databases we have developed are as representative as the reference data on which haplotype frequency estimates presently rely. To achieve the second objective, we modified the existing structure of the European DNA Profiling Group mtDNA Population database (EMPOP) to both store and query full mtGenome reference data. In addition, we further improved the utility of the database for forensic applications by the addition of a number of new features. These additions include software that performs 2 This resource was prepared by the author(s) using Federal funds provided by the U.S. Department of Justice. Opinions or points of view expressed are those of the author(s) and do not necessarily reflect the official position or policies of the U.S. Department of Justice. Final Technical Report Award #2011-MU-MU-K402 automated mtDNA haplogroup estimations for both full and partial mtGenome sequences, updated population structure schemes for all mtDNA data currently housed in the database, and various tools that permit both searches and visualization of the geographic distribution of mtDNA haplogroups, sequences and individual sequence variants. The full mtGenome population reference data we have produced, and the tools for their storage and use that we have developed in the new version of EMPOP (EMPOP3), will provide a solid foundation for the generation of complete mtGenome haplotype frequency estimates for forensic applications. The thoroughly vetted data can serve as a standard against which the quality and features of future mtGenome datasets (especially those developed via next generation sequencing) may be evaluated, and the extensive data review methods we applied can be used as a model for future mtGenome databasing initiatives. Our successful use of a semi-automated processing strategy on forensic-like samples provides laboratories with practical insight into the feasibility of producing complete mtGenome data in a routine casework environment, and our detailed empirical data on amplification success rates across a range of DNA input quantities will also be useful moving forward as PCR-based strategies for mtDNA enrichment are considered for targeted next generation sequencing workflows. Modifications to the EMPOP database will not only permit the query of full mtGenome datasets to assess haplotype frequencies, but will also provide substantially enhanced functionality for analyses of both complete and partial mtDNA sequences. Overall, the products we have delivered will provide the essential groundwork for expanded use of mtDNA typing in both the short and long term, as next generation sequencing methods begin to be employed in forensic casework practice. 3 This resource was prepared by the author(s) using Federal funds provided by the U.S. Department of Justice. Opinions or points of view expressed are those of the author(s) and do not necessarily reflect the official position or policies of the U.S. Department of Justice. Final Technical Report Award #2011-MU-MU-K402 Acknowledgements A number of key scientists made enormous contributions of both time and expertise to this grant project, and their dedication and exceptional quality of work was critical to its successful completion. We are grateful to Melissa Scheible, Kimberly Andreaggi, Spence Fast, Alexander Rck, Elizabeth Lyons, Jocelyn Bush, Jennifer Higginbotham, Michelle Peck, Joseph Ring, and Gabriela Huber. We also thank Catarina Xavier, Christina Strobl, Toni Diegoli, Martin Bodner, Liane Fendt, Petra Kralj, Simone Nagl, Daniela Niederwieser, and Bettina Zimmerman for contributing their time and considerable skill. We also wish to recognize the contributions of a long list of scientific colleagues, and technical and administrative personnel across multiple institutions who have been involved in or have supported the work in various ways. These individuals include Lara Adams, Thomas Callaghan, James Canik, Lanelle Chisholm, Richard Coughlin, Michael Cummings, Arne Dr, COL Louis Finelli, Constance Fisher, Shairose Lalani, Odile Loreille, Charla Marshall, Timothy McMahon, Jon Norris, Anthony Onorato, Eric Pokorak, CAPT Edward Reedy, Lt Col Laura Regan, James Ross, Richard Scheithauer and Cynthia Thomas. Finally, we wish to express our sincere gratitude to our NIJ program managers, Minh Nguyen and Chad Ernst, who have been incredibly supportive and helpful throughout the project. 4 This resource was prepared by the author(s) using Federal funds provided by the U.S. Department of Justice. Opinions or points of view expressed are those of the author(s) and do not necessarily reflect the official position or policies of the U.S. Department of Justice. Final Technical Report Award #2011-MU-MU-K402 Table of Contents Abstract............................................................................................................................................2 Acknowledgements..........................................................................................................................4 Executive Summary.........................................................................................................................6 Main Body .....................................................................................................................................18 I. Introduction........................................................................................................................18 II. Materials and Methods.......................................................................................................23 MtGenome sequencing protocol development .......................................................23 EMMA development ...............................................................................................29 Specimens and sampling ........................................................................................31 MtGenome data generation ...................................................................................31 MtGenome data review ..........................................................................................35 MtGenome data analyses .......................................................................................39

Maximizing Mtdna Testing Potential with the Generation of High-Quality Mtgenome Reference Data Author(S): Rebecca S

Mitochondrial Haplogroup Background May Influence

Different Matrilineal Contributions to Genetic Structure of Ethnic Groups in the Silk Road Region in China

Analysis of Y-Chromosome Polymorphisms in Pakistani

Haplogroup Relationships Between Domestic and Wild Sheep Resolved Using a Mitogenome Panel

The Genetic History of the Otomi in the Central Mexican Valley

Phylogenetic Resolution with Mtdna D-Loop Vs. HVS 1: Methodological Approaches in Anthropological Genetics Utilizing Four Siberian Populations

Mtdna Evidence: Genetic Background Associated with Related Populations at High Risk for Esophageal Cancer Between Chaoshan and T

A War-Prone Tribe Migrated out of Africa to Populate the World

Cultural Variation Impacts Paternal and Maternal Genetic Lineages of the Hmong-Mien and Sino-Tibetan Groups from Thailand

Ancient DNA Evidence of Population Affinities Casey C. Bennett

Haplogroup a Sample Report

DNA Analysis of an Early Modern Human from Tianyuan Cave, China