Interspecific Introgressive Origin of Genomic Diversity in the House
Total Page:16
File Type:pdf, Size:1020Kb
Interspecific Introgressive Origin of Genomic Diversity in the House Mouse Kevin J. Liu ∗, Ethan Steinberg y, Alexander Yozzo y , Ying Song z, Michael H. Kohn z , and Luay Nakhleh y z ∗Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824 (Research was performed while KJL was at Rice University),yDepartment of Computer Science, Rice University, Houston, TX 77005, and zBiosciences, Rice University, Houston, TX 77005 Submitted to Proceedings of the National Academy of Sciences of the United States of America We report on a genome-wide scan for introgression in the house trogressive descent origins of the mouse genome were hidden mouse (Mus musculus domesticus) involving the Algerian mouse due to data post-processing that masked introgressed genomic (Mus spretus), using samples from the ranges of sympatry and al- regions as missing data. In another study reporting whole- lopatry in Africa and Europe. Our analysis reveals wide variability genome sequencing of 17 classical lab strains [5], M. spretus in introgression signatures along the genomes, as well as across the was used as an outgroup for phylogenetic analysis. The au- samples. We find that fewer than half of the autosomes in each genome harbor all detectable introgression, while the X chromosome thors were surprised to find that 12.1% of loci failed to place has none. Further, European mice carry more M. spretus alleles than M. spretus as an outgroup to the M. musculus clade. The au- the sympatric African ones. Using the length distribution and shar- thors concluded that M. spretus was not a reliable outgroup, ing patterns of introgressed genomic tracts across the samples, we but did not pursue their observation further. On the other infer, first, that at least three distinct hybridization events involving hand, in a 2002 study [8], Orth et al. compiled data on al- M. spretus have occurred, one of which is ancient, and the other two lozyme, microsatellite, and mitochondrial variation in house are recent (one presumably due to warfarin rodenticide selection). mice from Spain (sympatry) and nearby countries in Western Second, several of the inferred introgressed tracts contain genes that and Central Europe. Interestingly then, allele sharing between are likely to confer adaptive advantage. Third, introgressed tracts the species was observed in the range of sympatry but not so might contain driver genes that determine the evolutionary fate of those tracts. Further, functional analysis revealed introgressed genes outside in the range of allopatry. The studies demonstrated that are essential to fitness, including the Vkorc1 gene, which is im- the possibility of natural hybridization between these two sis- plicated in rodenticide resistance, and olfactory receptor genes. Our ter species. Further, the study of Song et al. [9] demonstrated findings highlight the extent and role of introgression in nature, and a recent adaptive introgression from M. spretus into some M. call for careful analysis and interpretation of house mouse data in m. domesticus populations in the wild, involving the vitamin evolutionary and genetic studies. K epoxide reductase subcomponent 1 (Vkorc1) gene, which was later shown to be more widespread in Europe albeit ge- Mus musculs j Mus spretus j hybridization j adaptive introgression ographically restricted to parts of southwestern and central j PhyloNet-HMM Europe [10]. Major, unanswered questions arise from these studies. First, is the vicinity around the Vkorc1 gene an isolated case of Significance Statement adaptive introgression in the house mouse genome, or do many The mouse has been one of the main mammalian model organ- other such regions exist? Second, is introgression from M. isms used for genetic and biomedical research. Understanding spretus into M. m. domesticus common outside the range of the evolution of house mouse genomes would shed light not sympatry? Third, have there been other hybridization events, only on genetic interactions and their interplay with traits in and in particular, more ancient ones? Fourth, what role do the mouse, but would also have significant implications for introgressed genes, and more generally, genomic regions, play? human genetics and health. Analysis using a recently devel- To investigate these open questions, we used genome-wide oped statistical method shows that the house mouse genome is variation data from 20 M. m. domesticus samples (wild and a mosaic that contains previously unrecognized contributions wild-derived) from the ranges of sympatry and allopatry, as from a different mouse species. We traced these contributions well as two M. spretus samples. For detecting introgression, to ancient and recent interbreeding events. Our findings re- we used PhyloNet-HMM [11], a newly developed method for veal the extent of introgression in an important mammalian statistical inference of introgression in genomes while account- genome, and provide an approach for genome-wide scans of ing for other evolutionary processes, most notably incomplete arXiv:1311.5652v2 [q-bio.PE] 19 Sep 2014 introgression in other eukaryotic genomes. lineage sorting (ILS). Our analysis provides answers to the questions posed above. First, we find signatures of introgression in each of lassical laboratory mouse strains, as well as newly estab- the M. m. domesticus samples. The amount of introgression lished wild ones are widely used by geneticists for an- C varies across the autosomes of each genome, with a few chro- swering a diverse array of questions [1]. Understanding the genome contents and architecture of these strains is very im- portant for studies of natural variation and complex traits, as well as evolutionary studies in general [2]. Mus spretus, Reserved for Publication Footnotes a sister species of M. musculus, impacts the findings in M. musculus investigations for at least two reasons. First, it was deliberately interbred with laboratory M. musculus strains to introduce genetic variation [3]. Second, M. m. domesticus is partially sympatric (naturally co-occurring) with M. spretus (Fig. 1). Recent studies have examined admixture between sub- species of house mice [4, 5, 6, 7], but have not studied in- trogression with M. spretus. In at least one case [4], the in- www.pnas.org | | PNAS Issue Date Volume Issue Number 1{6 mosome harboring all detectable introgression, and most of amount of sharing of introgressed regions. Further, Group I the genomes have none. We detected no introgression on the has the most introgression and Group III has the least. No- X chromosome. Further, the amount of introgression varied tice that all samples within Group I, except for the one from widely across the samples. Our analyses demonstrate intro- Spain-Arenal, contain the introgression from M. spretus that gression outside the range of sympatry. In fact, our results carries Vkorc1 [9]. Group II contains all the allopatric Euro- show more signatures of introgression in the genomes of sam- pean mice that do not carry Vkorc1, and Group III contains ples from Europe than in those of samples from Africa. For the all the sympatric African samples. This categorization guides third question, we utilized the length distribution and sharing our results displays and analyses below. These results answer patterns of introgressed regions across the samples to show the first two questions we posed above in the affirmative: there support for at least three hybridization events: an ancient hy- are introgressed genomic regions beyond the region that con- bridization event that predates the colonization of Europe by tains Vkorc1, and introgression is present in all 20 samples, M. m. domesticus and two more recent events, one of which pointing to potential hybridization outside the range of sym- presumably occurred about 50 years ago and is related to war- patry. Quantifying how common such hybridization is outside farin resistance selection [9]. For the fourth question, our func- the range of sympatry, though, requires denser sampling that tional analysis of the introgressed genes shows enrichment for is beyond the scope of this work. certain categories, most notably olfaction|an essential trait for the fitness of rodents. Understanding the genomic archi- Support for more than a single hybridization event. tecture and evolutionary history of the house mouse has broad To answer the third question of whether multiple, distinct hy- implications on various aspects of evolutionary, genetic, and bridization events involving M. spretus and M. m. domesticus biomedical research endeavors that use this model organism. have occurred, we focused on two analyses: Inspecting the The PhyloNet-HMM method [11] can be used to detect intro- introgressed tract length distribution, where an introgressed gression in other eukaryotic species, further broadening the tract is defined as a maximally contiguous introgressed re- impact of this work. gion, and inspecting the sharing pattern of introgressed re- gions across the samples. Repeated back-crossing, recombina- tion, and drift result in fragmentation of introgressed regions, Results with very long regions pointing to recent hybridization events. We now describe our findings of introgression within the indi- On the other hand, selection on an adaptive introgressed re- vidual genomes, as well as across the genomes of the 20 M. m. gions could also maintain it for long periods, confounding the domesticus samples (40 haploid genomes). The four African tract length-based analysis of the age of hybridization. How- samples as well as the two samples from Spain are sympatric ever, if a long region is shared across some, but not all, samples with M. spretus, whereas the other samples are allopatric (Fig. from the population, that increases the likelihood of a recent 1). hybridization hypothesis. For each of the three groups, we plotted the distribution of introgressed tract lengths, where an introgressed tract is de- Genome-wide signals of introgression. Our analysis de- fined as a maximally contiguous introgressed region (Fig. 3). tected introgression in the genomes of all 20 M. m. domes- The figure shows that Group I contains the only samples that ticus samples; Fig.