B1 Insertions As Easy Markers for Mouse Population Studies
Mammalian Genome (Incorporating Mouse Genome)
Original Contributions

Pavel Munclinger,1 Pierre Boursot,2 Barbara Dod2

1 Biodiversity Research Group, Department of Zoology, Faculty of Science, Charles University, Vinicna 7, CZ 128 44 Praha 2, Czech Republic
2 Laboratoire Genome, Populations, Interactions, Universite Montpellier II, F-34095 Montpellier, France

Received: 22 November 2002 / Accepted: 5 February 2003

Correspondence to: P. Munclinger; E-mail: muncling@natur.cuni.cz
DOI: 10.1007/s00335-002-3065-7 • Volume 14, 359-366 (2003) • © Springer-Verlag New York, Inc. 2003

Abstract

Few simple, easy-to-score PCR markers are available for studying genetic variation in wild mouse populations belonging to Mus musculus at the population and subspecific levels. In this study, we show that the abundant B1 family of short interspersed DNA elements (SINEs) is a very promising source of such markers. Thirteen B1 sequences from different regions of the genome were retrieved on the basis of their high degree of homology to a mouse consensus sequence, and the presence of these elements was screened for in wild-derived mice representing M. spretus, M. macedonicus, and M. spicilegus and the different subspecies of M. musculus. At five of these loci, varying degrees of insertion polymorphism were found in M. m. domesticus mice. These insertions were almost totally absent in the mice representing the other subspecies and species. Six other B1 elements were fixed in all the Mus species tested. At these loci, polymorphism associated with three restriction sites in the B1 consensus sequence was found in M. musculus. Most of these polymorphisms appear to be ancestral, as they are shared by at least one of the other Mus species tested. Both insertion and restriction polymorphism revealed differences between five inbred laboratory strains considered to be of mainly domesticus origin, and at the six restriction loci a surprising number of these strains carried restriction variants that were either not found or very infrequent in domesticus. This suggests that in this particular group of loci, alleles of Far Eastern origin are more frequent than expected.

Although the house mouse (Mus musculus) is one of the most important mammalian laboratory models, there are still surprisingly few easy-to-score markers that can be used in population genetics studies or to differentiate among its different subspecies. Because of their high degree of polymorphism and the homoplasy (alleles of the same size resulting from independent mutation events) associated with their mode of evolution, microsatellite loci are difficult markers to use in such studies, and there is still relatively little comparative sequence data for the different subspecies of M. musculus that can be used to design SNP markers. Another potential source of markers is the orthologous SINEs, which have a very low probability of being inserted more than once in the same site. At the species level, young unfixed SINE loci have proved to be very useful for studying population structure and intraspecific demography (Shedlock and Okada 2000).

The youngest members of the Alu subfamilies that are currently active in the human genome have been shown to be a good source of markers for human population genetics (Batzer et al. 1996), and systematic genomic database mining (Roy-Engel et al. 2001) has been used successfully to identify the candidate Alu elements in the genomic sequences provided by the Human Genome Project. The B1 repeat family is the mouse equivalent of the Alu family, and here we assess the utility of using a similar strategy to find insertion polymorphisms of B1 elements within M. musculus that can be used as intra- or intersubspecific markers.

The B1 SINE family in the mouse consists of more than 100,000 copies that have accumulated by retroposition throughout the course of rodent evolution (Rogers 1985). Like the human Alu repeats, they are derived from 7SL RNA (Ullu and Tschudi 1984) and are copies of a limited number of active master or source copies that have inserted into the genome by retroposition. Quentin (1989) showed that throughout the evolution of the rodent lineages, B1 elements have been retroposed from a succession of variant progenitor sequences that have left subfamilies of related elements in the mouse genome. The subfamily generated by a given progenitor copy can be identified by the presence of shared variant nucleotide position(s) relative to the B1 consensus sequence. Kass et al. (2000) provide evidence that the progenitors of the four most recent subfamilies found in M. musculus, known as A, B, C, and D, have been active concomitantly in the lineage of the genus Mus that gave rise to M. musculus, M. spretus, M. spicilegus, and M. macedonicus.

The dynamics of B1 insertion in M. musculus are not as well documented as those of the Alu family in humans. Kass et al. (2000) showed that polymorphic B1 insertions associated with recent integrations exist in M. musculus, but there is no information about how widespread they are in the mouse genome. In this study, we undertook a population survey involving 250 wild mice belonging to the different M. musculus subspecies, to investigate the degree of insertion polymorphism associated with 13 B1 insertions of presumably recent origin that show a high degree of homology to the consensus sequence.

Materials and methods

Mice. The reference panel of mouse DNA, isolated from wild-derived strains or wild mice representing a substantial part of the M. musculus species range, came from the mouse DNA collections of the authors' laboratories in Montpellier and Prague. Most of the individuals represent the M. m. domesticus and M. m. musculus populations, but a few wild-derived strains representing M. m. castaneus (Taiwan, Thailand), M. m. molossinus, M. spretus (France and Spain), M. macedonicus (Bulgaria, Turkey, and Syria), and M. spicilegus (Moldavia, Austria, and Ukraine) were also included. The detailed list of individuals is available on request.

[Table 1. Accession numbers and sequence positions of the 13 B1 elements. The table body could not be recovered from the source.]

B1 insertions. Three hundred and ninety-seven sequences showing significant alignments with a consensus B1 sequence (Quentin 1989) were found in a Blast search (Altschul et al. 1997) of 7606 gene-containing genomic sequences (a total of 2.8 Mb) extracted from GenBank. Thirteen highly conserved B1 elements present in or close to mapped genes or pseudogenes were chosen from among these sequences. Their accession numbers and position in the sequence are summarized in Table 1.
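The selection step described above (retaining elements with a high degree of homology to the consensus) can be illustrated with a minimal sketch. It assumes that each candidate has already been aligned to the consensus, and it scores identity only over ungapped columns, which is a simplification. The function names, the toy sequences, and the 90% cutoff are hypothetical and are not taken from the study, which does not state a numerical threshold.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gapped columns are skipped (simplifying assumption)
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def select_conserved(consensus: str, candidates: dict, min_identity: float) -> list:
    """Names of candidates whose identity to the consensus meets the cutoff."""
    return [
        name
        for name, seq in candidates.items()
        if percent_identity(consensus, seq) >= min_identity
    ]


# Toy data for illustration only; these are NOT real B1 sequences.
TOY_CONSENSUS = "ACGTACGTAC"
TOY_CANDIDATES = {
    "locus_a": "ACGTACGTAC",  # identical to the toy consensus
    "locus_b": "ACGTTCGTAC",  # one mismatch in ten columns
    "locus_c": "ACG-ACTTAC",  # one gap and one mismatch
}

# The 90% cutoff is a hypothetical value chosen for the example.
print(select_conserved(TOY_CONSENSUS, TOY_CANDIDATES, min_identity=90.0))
```

In practice, the alignment and its significance would come from the Blast output itself; the sketch shows only the filtering logic applied to sequences that are already aligned.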
For the sake of simplicity, these PCR loci are referred to by the name of the nearest gene or pseudogene.

Primers and PCR conditions. Table 1 shows the PCR primers flanking these B1 element sequences, which were defined by using the program Primer3 (Rozen and Skaletsky 2000). PCR reactions (10 µL) were performed in the presence of 2 mM MgCl2 and 0.25 units of Taq polymerase. After an initial incubation at 95°C for 3 min, the reactions were amplified for 35 cycles at 95°C for 45 s, 60°C for 45 s, and 72°C for 45 s. The last step was a final extension at 72°C for 10 min. The PCR products were run on 2% agarose gels, and the presence or absence of the given B1 element was deduced from the size of the amplified DNA product.

Search for RFLPs in the fixed B1 elements. In general, the CpG dinucleotides present in the B1 elements have a higher tendency to show changes relative to the consensus sequence than the other positions (Quentin 1989), and this holds for the conserved sequences we analyzed in this study (Fig. 1). Restriction enzymes with sites that include these dinucleotides are more likely to reveal polymorphisms than enzymes with sites in the more conserved regions of the B1 sequence. We therefore used MspI, which has one site, and TaqI, which has two sites [TaqI(1) and TaqI(2)] in the consensus sequence