Population Genomic Structure and Adaptation in the Zoonotic Malaria Parasite Plasmodium knowlesi
Samuel Assefa (a), Caeul Lim (b), Mark D. Preston (a), Craig W. Duffy (a), Mridul B. Nair (c), Sabir A. Adroub (c), Khamisah A. Kadir (d), Jonathan M. Goldberg (b), Daniel E. Neafsey (e), Paul Divis (a,d), Taane G. Clark (a), Manoj T. Duraisingh (b,d), David J. Conway (a,d,1), Arnab Pain (c,d,f,1), and Balbir Singh (d,1)

(a) Pathogen Molecular Biology Department, London School of Hygiene and Tropical Medicine, London, WC1E 7HT United Kingdom; (b) Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA 02115; (c) Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Jeddah, Kingdom of Saudi Arabia; (d) Malaria Research Centre, Universiti Malaysia Sarawak, 94300 Kota Samarahan, Sarawak, Malaysia; (e) Broad Institute of MIT and Harvard, Cambridge, MA 02142; and (f) Center for Zoonosis Control, Global Institution for Collaborative Research and Education, Hokkaido University, N20 W10 Kita-ku, Sapporo, Japan

Edited by Xin-zhuan Su, National Institutes of Health, Rockville, MD, and accepted by the Editorial Board September 3, 2015 (received for review May 21, 2015)

Malaria cases caused by the zoonotic parasite Plasmodium knowlesi are being increasingly reported throughout Southeast Asia and in travelers returning from the region. To test for evidence of signatures of selection or unusual population structure in this parasite, we surveyed genome sequence diversity in 48 clinical isolates recently sampled from Malaysian Borneo and in five lines maintained in laboratory rhesus macaques after isolation in the 1960s from Peninsular Malaysia and the Philippines. Overall genomewide nucleotide diversity (π = 6.03 × 10^−3) was much higher than has been seen in worldwide samples of either of the major endemic malaria parasite species Plasmodium falciparum and Plasmodium vivax. A remarkable substructure is revealed within P. knowlesi, consisting of two major sympatric clusters of the clinical isolates and a third cluster comprising the laboratory isolates. There was deep differentiation between the two clusters of clinical isolates [mean genomewide fixation index (F_ST) = 0.21, with 9,293 SNPs having fixed differences of F_ST = 1.0]. This differentiation showed marked heterogeneity across the genome, with mean F_ST values of different chromosomes ranging from 0.08 to 0.34 and with further significant variation across regions within several chromosomes. Analysis of the largest cluster (cluster 1, 38 isolates) indicated long-term population growth, with negatively skewed allele frequency distributions (genomewide average Tajima's D = −1.35). Against this background there was evidence of balancing selection on particular genes, including the circumsporozoite protein (csp) gene, which had the top Tajima's D value (1.57), and scans of haplotype homozygosity implicate several genomic regions as being under recent positive selection.

population genomics | Plasmodium diversity | reproductive isolation | zoonosis | adaptation

Significance

Genome sequence analysis reveals that the zoonotic malaria parasite Plasmodium knowlesi consists of three highly divergent subpopulations. Two, commonly seen in sympatric human clinical infections in Malaysian Borneo, were identified in a previous study as corresponding to parasites seen in long-tailed and pig-tailed macaque hosts, respectively. A third type has been detected in a few laboratory-maintained isolates originally derived in the 1960s elsewhere in Southeast Asia. Divergence between the subpopulations varies significantly across the genome but overall is at a level indicating different subspecies. Analysis of the diversity within the most common type in human infections shows strong signatures of natural selection, including balancing selection and directional selection, on loci distinct from those under selection in endemic human malaria parasites.

Author contributions: D.J.C., A.P., and B.S. designed research; S.A., C.L., M.B.N., S.A.A., K.A.K., D.E.N., M.T.D., D.J.C., A.P., and B.S. performed research; S.A., M.D.P., C.W.D., J.M.G., D.E.N., P.D., T.G.C., M.T.D., D.J.C., A.P., and B.S. analyzed data; and S.A., D.J.C., A.P., and B.S. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission. X.S. is a guest editor invited by the Editorial Board.

Freely available online through the PNAS open access option.

Data deposition: The sequences reported in this paper have been deposited in the European Nucleotide Archive (accession no. PRJEB10288) and NCBI database (accession no. SRP063014).

(1) To whom correspondence may be addressed.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1509534112/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1509534112 PNAS Early Edition

The zoonotic malaria parasite Plasmodium knowlesi is a significant cause of human malaria, with a wide spectrum of clinical outcomes including high parasitemia and death (1–4). Long known as a malaria parasite of long-tailed and pig-tailed macaques (5), the first large focus of human cases was described only in 2004 in the Kapit Division of Sarawak in Malaysian Borneo (6). Since then infections have been described from almost all countries in Southeast Asia (2, 7). Travelers to the region from Europe, North America, and Australasia also have recently acquired P. knowlesi malaria (7, 8). Until the application of molecular assays for specific detection, human P. knowlesi malaria was largely misdiagnosed as Plasmodium malariae, a morphologically similar but distantly related species (1, 6, 9, 10). Studies in the Kapit Division of Sarawak in Malaysian Borneo have indicated that P. knowlesi malaria is primarily a zoonosis with macaques as reservoir hosts (11) and that the forest-dwelling mosquito species Anopheles latens is the local vector for P. knowlesi (12). Other members of the Anopheles leucosphyrus group are vectors in different parts of Southeast Asia and may determine the geographical distribution of transmission (13, 14).

Although P. knowlesi malaria is regarded as an emerging infection, there clearly have been increased efforts in detection made since its existence as a significant zoonosis was discovered, and specific detection has been enhanced by the declining numbers of human cases caused by other malaria parasites in Southeast Asia (15). Aside from the first two human cases described several decades ago (5), there is direct evidence of human P. knowlesi infections from ∼20 y ago in Malaysian Borneo and Thailand obtained by retrospective molecular analysis of material from archived blood spots and slides (10, 16), and molecular population genetic evidence indicates the zoonosis has been in existence for a much longer time (11). The genetic diversity of P. knowlesi is high within humans as well as macaques, with sequence data on three loci [the circumsporozoite protein (csp) gene, 18S rRNA, and mtDNA genome] indicating extensive shared polymorphism and no fixed differences between P. knowlesi parasites from humans and monkeys sampled in the same area in Sarawak, Malaysian Borneo (6, 11). Analysis of samples from a smaller number of humans and monkeys in Thailand showed alleles of the P. knowlesi merozoite surface protein 1 (msp1) gene to be similarly diverse in both hosts (16), and there were shared polymorphisms of the csp gene in parasites from a few infections examined in humans and macaques in Singapore (17). Recent multilocus microsatellite analysis has indicated a deep population subdivision in P. knowlesi associated with long-tailed and pig-tailed macaques; both major types infect humans and occur sympatrically at most sites in Malaysia, but the two types show some additional geographical differentiation across sites (18).

Human populations have grown very rapidly in the Southeast Asian region and encroach on most of the wild macaque habitats, so it is vital to know if P. knowlesi parasites are adapting to human hosts or to anthropophilic mosquito vector species, either of which could cause human–mosquito–human transmission. Initial analysis of the P. knowlesi reference genome sequence (strain H) highlighted some unique features of the genome of this species (19), namely, schizont infected cell agglutination variant (SICAvar) and knowlesi interspersed repeat (KIR) variant antigen genes,

(21.7 Mb of the 23.5-Mb reference genome), including 93% of all predicted protein-coding genes (4,860 genes). Overall, 975,642 biallelic high-quality SNPs were identified among the 53 isolates, indicating a very high level of nucleotide diversity [genomewide π = 6.03 × 10^−3 (genic π = 4.84 × 10^−3; noncoding π = 7.50 × 10^−3)]. Almost half of the SNPs (423,160; 43.4% of the total) had a minor allele that occurred in only one isolate, and 21.8% (n = 213,375) had minor allele frequencies of above 10% in the overall sample of isolates.

All individual isolates were very distinct from one another (Fig. 1) except for two of the laboratory lines [labeled "H(AW)" and "Malayan"], which were virtually identical to each other and to the published H strain reference genome sequence (19). Only 251 SNPs were identified among these three sequences [numbers of pairwise differences: H reference vs. Malayan = 226; H ref-