Mitochondrial Genomes from Modern Horses Reveal the Major Haplogroups That Underwent Domestication
Total Page:16
File Type:pdf, Size:1020Kb
Mitochondrial genomes from modern horses reveal the major haplogroups that underwent domestication Alessandro Achillia,1, Anna Olivierib, Pedro Soaresc, Hovirag Lancionia, Baharak Hooshiar Kashanib, Ugo A. Peregob,d, Solomon G. Nergadzeb, Valeria Carossab, Marco Santagostinob, Stefano Capomaccioe, Michela Felicettie, Walid Al-Achkarf, M. Cecilia T. Penedog, Andrea Verini-Supplizie, Massoud Houshmandh, Scott R. Woodwardd, Ornella Seminob, Maurizio Silvestrellie, Elena Giulottob, Luísa Pereirac,i, Hans-Jürgen Bandeltj, and Antonio Torronib,1 aDipartimento di Biologia Cellulare e Ambientale, Università di Perugia, 06123 Perugia, Italy; bDipartimento di Biologia e Biotecnologie “L. Spallanzani”, Università di Pavia, 27100 Pavia, Italy; cInstituto de Patologia e Imunologia Molecular da Universidade do Porto (IPATIMUP), 4200-465 Porto, Portugal; dSorenson Molecular Genealogy Foundation, Salt Lake City, UT 84115; eCentro di Studio del Cavallo Sportivo, Dipartimento di Patologia, Diagnostica e Clinica Veterinaria, Università di Perugia, 06123 Perugia, Italy; fDepartment of Molecular Biology and Biotechnology, Atomic Energy Commission, 6091 Damascus, Syria; gVeterinary Genetics Laboratory, University of California, Davis, CA 95616; hDepartment of Medical Genetics, National Institute for Genetic Engineering and Biotechnology (NIGEB), 14965/161 Tehran, Iran; iFaculdade de Medicina, Universidade do Porto, 4200-319 Porto, Portugal; and jDepartment of Mathematics, University of Hamburg, 20146 Hamburg, Germany Edited by Francisco Mauro Salzano, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil, and approved December 22, 2011 (received for review July 20, 2011) Archaeological and genetic evidence concerning the time and 16,351–16,660) within the control region (D-loop: np 15,469– mode of wild horse (Equus ferus) domestication is still debated. 16,660) (7, 10–13). In many other mammalian species, the vari- High levels of genetic diversity in horse mtDNA have been de- ation seen in this short segment of the mitochondrial genome is tected when analyzing the control region; recurrent mutations, often accompanied by high levels of recurrent mutations, thus however, tend to blur the structure of the phylogenetic tree. Here, blurring the structure of the tree and rendering the distinction we brought the horse mtDNA phylogeny to the highest level of between some important ancient branches within the tree virtu- molecular resolution by analyzing 83 mitochondrial genomes from ally impossible (14–16). To address this issue and improve modern horses across Asia, Europe, the Middle East, and the Amer- mtDNA phylogeny, we analyzed a total of 83 horse mitochondrial GENETICS icas. Our data reveal 18 major haplogroups (A–R) with radiation genomes (GenBank accession nos. JN398377–JN398457). times that are mostly confined to the Neolithic and later periods and place the root of the phylogeny corresponding to the Ancestral Results and Discussion Mare Mitogenome at ∼130–160 thousand years ago. All hap- Phylogeny of Horse Mitochondrial Genomes. For complete se- logroups were detected in modern horses from Asia, but F was quencing, mtDNAs were collected from a rather wide range of only found in E. przewalskii—the only remaining wild horse. There- horse breeds across Asia, Europe, the Middle East, and the fore, a wide range of matrilineal lineages from the extinct E. ferus Americas (SI Appendix,TableS1). Excluding ambiguous sites and underwent domestication in the Eurasian steppes during the Eneo- the 16129–16360 short tandem repeat (STR), we identified a total lithic period and were transmitted to modern E. caballus breeds. number of 667 varied sites (Table 1): 549 in the coding region ANTHROPOLOGY Importantly, now that the major horse haplogroups have been de- (15,476 nt) and 118 in the noncoding regions (1,032 nt), mostly fi ned, each with diagnostic mutational motifs (in both the coding represented by the D-loop (113 of 960 nucleotides). Overall, we i and control regions), these haplotypes could be easily used to ( ) observed an average number of 82.0 ± 30.7 nucleotide differences ii classify well-preserved ancient remains, ( ) (re)assess the hap- between two randomly chosen sequences. Fig. 1 illustrates the logroup variation of modern breeds, including Thoroughbreds, distribution of the total number of mutations along the genome iii and ( ) evaluate the possible role of mtDNA backgrounds in (continuous line). A similar plot was obtained when considering racehorse performance. the variation of nucleotide diversity (π)(17)alongtheentire mtDNA by assessing windows of 200 bp (step size = 100 bp) horse mitochondrial genome | mtDNA haplogroups | centered at the midpoint (Fig. 1, dotted line). As expected, the origin of Equus caballus | Przewalski’s horse | animal domestication highest diversity was observed around the HVS-I segment (from np 15,500 to np 15,800), with a peak of 0.02699. The latter value is in arming and animal domestication were fundamental steps in agreement with the values previously reported, which range from Fhuman development, contributing to the rise of larger settle- 0.03 to 0.05 (12). Within the coding region, the highest proportion ments and more stratified societies and eventually, great civili- of varied sites was found in protein-coding genes (Table 1) where, zations. One of the key events in this process was the domesti- consistent with previous reports on the human mitochondrial ge- cation of the wild horse (Equus ferus), which was profoundly nome (18–20), there was an excess of synonymous mutations. connected to numerous human activities in both prehistorical and historical times. The horse served as a means to provide food, facilitate transportation, and (since the Bronze age) enhance Author contributions: A.A., E.G., and A.T. designed research; A.A., A.O., H.L., B.H.K., S.G.N., warfare capabilities. A question that has not yet been completely V.C., and M. Santagostino performed research; A.A., A.O., S.C., M.F., W.A.-A., M.C.T.P., addressed concerns the timing and mode in which humans began A.V.-S., M.H., S.R.W., O.S., M. Silvestrelli, E.G., and A.T. contributed new reagents/analytic fi – tools; A.A., A.O., P.S., L.P., and H.-J.B. analyzed data; and A.A., A.O., P.S., U.A.P., L.P., to bene t from the employment of these animals (1 6). H.-J.B., and A.T. wrote the paper. From a genetic point of view, animal domestication can be The authors declare no conflict of interest. reconstructed by performing phylogeographic analyses of both This article is a PNAS Direct Submission. nuclear and mitochondrial genomic data (2, 7). A draft sequence Freely available online through the PNAS open access option. for the horse nuclear genome was recently published (8), whereas fi Data deposition: The sequences reported in this paper have been deposited in the Gen- the rst complete mtDNA sequence for this species has been Bank database (accession nos. JN398377–JN398457). available since 1994 (9). Despite this information, nearly all of the 1 To whom correspondence may be addressed. E-mail: [email protected] or molecular and evolutionary studies of horse mtDNA have gen- [email protected]. ∼ – erally focused on 350 650 bp from two HyperVariable Seg- This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. ments (HVS-I: nucleotide position 15,469–15,834; HVS-II: np 1073/pnas.1111637109/-/DCSupplemental. www.pnas.org/cgi/doi/10.1073/pnas.1111637109 PNAS Early Edition | 1of6 Downloaded by guest on September 26, 2021 Table 1. Distribution and recurrence of mutations in the 83 horse mtDNA sequences Noncoding Coding D-loop (excluding STR) Other rRNA tRNA mRNA Nonsynonymous Synonymous Length (bp) 960 72 2,556 1,522 11,400 8,522 2,845 Unvaried sites 847 67 2,496 1,496 10,937 8,434 2,469 Number of varied sites 113 5 60 26 463 88* 376* Proportion of varied sites 0.133 0.075 0.024 0.017 0.042 0.010 0.152 Sites with a single hit 63 3 59 25 412 77 335 Sites with two hits 18 2 1 1 42 7 36 Sites with three or more hits 32 0 0 0 9 4 5 Indels 6 1 5 1 0 0 0 Transitions 104 3 50 25 447 85 363 Transversions 3 1 5 0 16 3 13 Transition/transversion ratio 34.7 3.0 10.0 n.a. 27.9 28.3 27.9 *A unique transition at np 8,007 was counted two times: 8007(ATP8) and 8007(V15M ATP6). The evolutionary history of the 83 complete mtDNA sequen- Horse mtDNA Haplogroups. A total of 18 major haplogroups were ces (81 different haplotypes) was inferred by a parsimony ap- labeled in alphabetical order (from A to R), each defined by a fi proach (SI Appendix, Fig. S1). To translate the molecular di- speci c motif encompassing both the coding and control regions. vergence of the major tree nodes into time, the 83 complete The direct comparison of the new and old nomenclatures shows that all deep nodes in the horse phylogeny and most haplogroups mtDNA sequences were compared with the single available were not resolved by control region data alone (SI Appendix, complete mtDNA sequence (donkey reference sequence) from fi E. asinus Table S4). After these new haplogroups were de ned, we were a donkey ( ). A maximum likelihood (ML) tree was then able to compare the above-mentioned mutational patterns within estimated based on synonymous mutations alone (Fig. 2). De- and outside the established haplogroups (Table 2). A sign of puri- spite the number of potential factors causing time-dependent fying selection was evident when comparing the nonsynonymous/ rates (21), the synonymous molecular clock is most likely to re- synonymous ratio, which was significantly lower (P value < 0.001; main linear over time, although saturation can also be an issue Fisher exact test) in the deep portion of the tree.