When Genetic Distance Matters: Measuring Genetic Differentiation at Microsatellite Loci in Whole-Genome Scans of Recent and Incipient Mosquito Species
Rui Wang*, Liangbiao Zheng†, Yeya T. Touré‡, Thomas Dandekar*§, and Fotis C. Kafatos*§

*European Molecular Biology Laboratory, Meyerhofstrasse 1, 69012 Heidelberg, Germany; †Yale University School of Medicine, Epidemiology and Public Health, 60 College Street, New Haven, CT 06520; and ‡Université du Mali, Faculté de Médecine, de Pharmacie et d'Odonto-Stomatologie, B.P. 1805, Bamako, Mali

Contributed by Fotis C. Kafatos, January 2, 2001

PNAS | September 11, 2001 | vol. 98 | no. 19 | 10769–10774 | www.pnas.org/cgi/doi/10.1073/pnas.191003598

Abbreviations: IAM, infinite allele model of mutation; SMM, stepwise mutation model; rDNA, ribosomal DNA.

§To whom reprint requests should be addressed.

Genetic distance measurements are an important tool to differentiate field populations of disease vectors such as the mosquito vectors of malaria. Here, we have measured the genetic differentiation between Anopheles arabiensis and Anopheles gambiae, as well as between proposed emerging species of the latter taxon, in whole genome scans by using 23–25 microsatellite loci. In doing so, we have reviewed and evaluated the advantages and disadvantages of standard parameters of genetic distance, FST, RST, (δμ)², and D. Further, we have introduced new parameters, D′ and DK, which have well-defined statistical significance tests and complement the standard parameters to advantage. D′ is a modification of D, whereas DK is a measure of covariance based on Pearson's correlation coefficient. We find that A. gambiae and A. arabiensis are closely related at most autosomal loci but appear to be distantly related on the basis of X-linked chromosomal loci within the chromosomal Xag inversion. The M and S molecular forms of A. gambiae are practically indistinguishable but differ significantly at two microsatellite loci from the proximal region of the X, outside the Xag inversion. At one of these loci, both M and S molecular forms differ significantly from A. arabiensis, but remarkably, at the other locus, A. arabiensis is indistinguishable from the M molecular form of A. gambiae. These data support the recent proposal of genetically differentiated M and S molecular forms of A. gambiae.

Many major infectious diseases, such as malaria, leishmaniasis, and sleeping sickness, are transmitted by insect vectors. Molecular genetic markers have become powerful tools for elucidating the population biology and evolution of such vectors, topics that are highly relevant to disease transmission in the field (1–4). Genetic variation in vector populations contributes to their susceptibility to infection by the pathogen, their degree of anthropophily, their daily survival and reproductive rates, and the epidemiology of the disease in the human host (5). A case in point is the African mosquito of the Anopheles gambiae (sensu lato) complex (5). These include the most important vector of human malaria, A. gambiae (sensu stricto), as well as closely related species that are significant vectors in specific areas (e.g., Anopheles arabiensis) or are altogether unable to serve as vectors (Anopheles quadriannulatus). Furthermore, even within A. gambiae s.s., cytologically defined chromosomal forms (e.g., Mopti, Savanna, and Bamako) are reproductively isolated in the northern dry areas of West Africa, including Mali and Burkina Faso, and may represent emerging species with different disease transmission characteristics (5, 6).

Although many DNA regions have been recently analyzed to examine genetic differentiation within A. gambiae s.s., the only fixed molecular differences found so far that consistently discriminate chromosomal forms are in the X-linked ribosomal (r)DNA region (1–4, 7). In Mali and Burkina Faso, these markers distinguish Mopti from Savanna and Bamako chromosomal forms; however, when the analysis is extended to additional populations in West Africa, two nonpanmictic units are identified even in the absence of chromosomal differentiation. This observation recently led to the definition of "molecular forms M and S" (1) or "molecular types I and II" (2), on the basis of fixed differences in the intergenic spacer or internal transcribed spacer rDNA regions, respectively. Because the repetitive nature of rDNA raises doubt as to its reliability as a marker of incipient speciation processes, much interest is now focused on possible new evidence of genetic distinctness between the forms/types.

Among molecular genetic markers, highly polymorphic microsatellites have been used extensively for population studies in humans (8), mammals (9), fruit flies (10), and anopheline mosquitoes (11–13). Various statistical models have been proposed for evaluating genetic differentiation (14–17), but additional theoretical and empirical comparisons regarding their efficacy would be helpful. For microsatellites, FST and D (14) are closely tied to the infinite allele model of mutation (IAM), where each mutation can produce an allele of any size (18). RST (16) and (δμ)² (15) are related to the stepwise-mutation model (SMM), which assumes that each allele mutates to either one of the immediately neighboring alleles with equal probability (19).

The standard genetic distance D (14) is an often used and popular parameter for classification and evolutionary studies. It was originally defined as an average value over all loci examined, but it can also be defined at each locus separately. Several variations of D have been used, for example, DC (20), DA, Dm (14), DSW (17), and DLR (9). In a bear study (9), D and DLR were comparably satisfactory but failed to resolve the most distantly related pairs of species: when loci have no alleles shared between two populations, D and DLR are not defined or, as has been proposed by Nei (14), take an infinite value that is problematical for any quantitative comparison.

As part of our ongoing studies of A. gambiae taxa and populations, here we compare the performance of presently used parameters of genetic distance [e.g., D, FST, RST, and (δμ)²], and we introduce and compare new parameters, D′ and DK. By using a battery of four parameters (FST, RST, D′, and DK), we identify intriguing differences in genetic distance between A. arabiensis and the M and S molecular forms of A. gambiae, at loci representing different chromosomal regions.

Materials and Methods

Origin of Mosquitoes. Field-collected female mosquitoes were species-identified with molecular markers (21). A total of 268 A. gambiae were collected in July 1996 in Mali, West Africa: 95 from Selenkenyi (Sel) and 92 and 81 from Soulouba (Soul) and Kokouna (Kn). Twenty of the 81 A. arabiensis were collected from the same villages in Mali at the same time as A. gambiae (1, 4, and 15 from Sel, Soul, and Kn, respectively). The remaining 61 A. arabiensis mosquitoes were collected from Kilifi, Kenya, in June 1998. A. gambiae mosquitoes from the villages Sel and Soul were also subjected to karyotyping on the basis of polytene chromosome inversions, but because of technical limitations, only 28, 24, and 11 mosquitoes were identified definitively as Mopti, Savanna, and Bamako (6). Use of a PCR restriction fragment length polymorphism marker (7) unambiguously classified the A. gambiae specimens as M or S molecular forms, with an efficiency of 91%. All mosquitoes were genotyped at microsatellite loci by previously described high-throughput methods (22). All 81 available A. arabiensis were used for Figs. 1–3. Because some parameters are sensitive to differences in sample size, we introduced sample weights for FST and partly for RST (Table 1) and also used a number of A. gambiae comparable to that of A. arabiensis. The percentages of M and S molecular form A. gambiae were 73/27 in Sel, 7/93 in Soul, and 17/83 in Kn, respectively. Figs. 1A and 2A are based on all A. gambiae from Sel; Figs. 1B, 2B, and 3 are based on all M- and S-form mosquitoes from Sel and Soul and an additional 36 individuals from Kn to make the sample sizes comparable.

Fig. 1. Comparison of frequencies of allele sizes at 23 microsatellite loci, in 95 A. gambiae and 81 A. arabiensis mosquitoes (A), as well as at two loci in 77 M- and 94 S-form A. gambiae (B). Because of space limitations, allele spacing has been shortened, and alleles at tails have been combined. The data are presented in full with helpful color views on our web site (http://www.embl-heidelberg.de/ExternalInfo/kafatos/publications/PROG/).

Statistical Parameters and Significance Tests. We have introduced DK as a normalized measure of differentiation on the basis of Pearson's correlation coefficient, r, which considers the distribution of alleles in two populations around their respective mean allele frequency (Table 1). Depending on the degree of freedom f, two direct statistical significance tests, Pt and Pf, can be applied. Pt is a modified version of Student's t test, which was originally introduced by Gosset in 1908 (23) to evaluate the difference between two means. However, it can also be used to evaluate the covariance of allele frequencies in two populations around their mean frequencies, which are assumed to be identical. The null hypothesis r = 0 supposes, with regard to population comparisons, that two analyzed populations are independent (23–25).
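To make the ingredients of this approach concrete, the following Python sketch computes Pearson's r between the allele-frequency vectors of two populations at a single locus, together with the ordinary textbook t statistic for the null hypothesis r = 0. It is an illustration under stated assumptions, not the procedure of this paper: the exact definitions of DK, Pt, and Pf are those of Table 1 and are not reproduced here, the choice f = n − 2 (with n the number of allele classes) is the standard one for a correlation test and may differ from the paper's, and the function names and allele frequencies are hypothetical. For contrast, the sketch also evaluates Nei's standard distance at one locus, D = −ln I, which becomes infinite when the two populations share no alleles, whereas r stays finite.

```python
import math


def pearson_r(x, y):
    """Pearson's r between two equal-length allele-frequency vectors."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need equal-length vectors with at least 3 allele classes")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a vector has zero variance")
    return sxy / math.sqrt(sxx * syy)


def t_for_zero_correlation(r, n):
    """Textbook t statistic for H0: r = 0, with f = n - 2 degrees of freedom.

    Assumption: this is the standard test, not the modified Pt of the paper.
    """
    f = n - 2
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r), f
    return r * math.sqrt(f / (1.0 - r * r)), f


def nei_standard_distance(x, y):
    """Nei's standard genetic distance at one locus, D = -ln(I)."""
    jxy = sum(a * b for a, b in zip(x, y))
    jx = sum(a * a for a in x)
    jy = sum(b * b for b in y)
    if jxy == 0:
        return math.inf  # no shared alleles: identity I is 0
    return -math.log(jxy / math.sqrt(jx * jy))


# Hypothetical frequencies over five allele sizes; no allele is shared.
pop_a = [0.6, 0.4, 0.0, 0.0, 0.0]
pop_b = [0.0, 0.0, 0.2, 0.3, 0.5]

r = pearson_r(pop_a, pop_b)
t, f = t_for_zero_correlation(r, len(pop_a))
print("r =", r, "t =", t, "f =", f)
print("Nei D =", nei_standard_distance(pop_a, pop_b))
```

In this made-up example the two populations have no allele in common, so Nei's D is infinite, while r is a finite negative number that can still be compared across loci.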
Fig. 2. Genetic differentiation at 23 microsatellite loci across the genome on the basis of FST, RST, D′, and DK.

In