Gross et al. BMC Ecol Evo (2021) 21:139 BMC Ecology and Evolution https://doi.org/10.1186/s12862-021-01872-z

RESEARCH Open Access Evolution of transcriptional control of antigenic variation and virulence in human and ape parasites Mackensie R. Gross, Rosie Hsu and Kirk W. Deitsch*

Abstract Background: The most severe form of human malaria is caused by the protozoan parasite falciparum. This unicellular organism is a member of a subgenus of Plasmodium called the Laverania that infects apes, with P. falciparum being the only member that infects humans. The exceptional virulence of this species to humans can be largely attributed to a family of variant surface antigens placed by the parasites onto the surface of infected red blood cells that mediate adherence to the vascular endothelium. These are encoded by a large, multicopy family called var, with each var gene encoding a diferent form of the . By changing which var gene is expressed, parasites avoid immune recognition, a process called antigenic variation that underlies the chronic nature of malaria . Results: Here we show that the common ancestor of the branch of the Laverania lineage that includes the human parasite underwent a remarkable change in the organization and structure of elements linked to the complex transcriptional regulation displayed by the var gene family. Unlike the other members of the Laverania, the clade that gave rise to P. falciparum evolved distinct subsets of var distinguishable by diferent upstream transcriptional regulatory regions that have been associated with diferent expression profles and virulence properties. In addition, two uniquely conserved var genes that have been proposed to play a role in coordinating transcriptional switching similarly arose uniquely within this clade. We hypothesize that these changes originated at a time of dramatic climatic change on the African continent that is predicted to have led to signifcant changes in transmission dynamics, thus selecting for patterns of antigenic variation that enabled lengthier, more chronic infections. Conclusions: These observations suggest that changes in transmission dynamics selected for signifcant alterations in the transcriptional regulatory mechanisms that mediate antigenic variation in the parasite lineage that includes P. falciparum. These changes likely underlie the chronic nature of these infections as well as their exceptional virulence. Keywords: Cytoadherence, Transcriptional regulation, Mutually exclusive expression, Pathogenesis

Background resulting in signifcant morbidity and mortality within Eukaryotic parasites of the genus Plasmodium infect a human populations in tropical and subtropical regions of broad range of vertebrate species, including birds, rep- the world [2]. Malaria parasites exhibit multiple morpho- tiles and mammals [1]. Tere are fve species that infect logical states as they transition between their mosquito humans and cause malaria, a disease that can be severe, vectors and their vertebrate hosts, but all symptoms of the disease result from asexual replication of the para- *Correspondence: [email protected] sites within the circulating red blood cells (RBCs) of the Department of Microbiology and Immunology, Weill Cornell Medical infected individual. Tis stage of the is associ- College, New York, NY, USA ated with a massive increase in parasite numbers and can

© The Author(s) 2021. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creat​ iveco​ mmons.​ org/​ licen​ ses/​ by/4.​ 0/​ . The Creative Commons Public Domain Dedication waiver (http://creat​ iveco​ ​ mmons.org/​ publi​ cdoma​ in/​ zero/1.​ 0/​ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data. Gross et al. BMC Ecol Evo (2021) 21:139 Page 2 of 15

lead to severe anemia, extensive infammation and vascu- the infected cell surface and thereby undergo antigenic lar occlusion that can disrupt circulation [3]. Te infu- variation. Whole genome analysis of many P. falciparum ence of malaria on human evolution is thought to have isolates also indicates that the var gene family is highly been substantial, and infection by Plasmodium parasites recombinogenic [16, 17], resulting in rapid diversifca- has been proposed to have played a signifcant role in tion of the EMP1 coding regions and thus preventing the shaping the [4]. acquisition of cross-reactive immunity in geographical Of the fve parasite species that infect humans, by far regions with high transmission rates. Te var gene family the most virulent is P. falciparum. Te virulence of this therefore represents a key attribute of the Laverania that species is due in large part to the cytoadhesive proper- has substantially contributed to its evolutionary success ties of infected RBCs that result from the placement on and the increased virulence displayed by P. falciparum the RBC surface of parasite encoded variable antigens, compared to other human-infective species of malaria included proteins referred to as RIFIN (repetitive inter- parasites. spersed family), STEVOR (subtelomeric variant open Te var gene family is present in the genomes of all reading frame), PfMC-2TM (P. falciparum Maurer’s seven known members of the Laverania subgenus, indi- cleft-2 transmembrane domain proteins) and PfEMP1 cating that it evolved in the common ancestor of these (P. falciparum erythrocyte membrane protein 1) [5, 6]. parasites. Interestingly however, in their recent compara- Te contribution of RIFINs, STEVORs and PfMC-2TMs tive study, Otto and colleagues documented a remarkable to cytoadherence is unclear, however PfEMP1 has been change in the organization of protein domain structures shown to be the major cytoadhesive molecule and is within the EMP1 proteins encoded by diferent clades thought to be the primary virulence determinant of P. within the Laverania subgenus [12]. Specifcally, they falciparum [7]. P. falciparum is the sole human-infective found that four species that constitute a clade which member of a subgenus of Plasmodium that infects apes includes P. falciparum have diverged signifcantly in called the Laverania. Unlike malaria species from other EMP1 protein structure, suggesting that parasites from evolutionary lineages, the seven characterized species of this clade cytoadhere to diferent host endothelial recep- the Laverania are unique for their expression of EMP1 on tors when compared to the more distantly related mem- the surface of infected RBCs [7] where it binds to mol- bers of the subgenus. If correct, this marks a signifcant ecules on the host endothelial surface and enables the moment in the evolution of this lineage of pathogens, infected cell to adhere to blood vessel walls, thus seques- particularly since this alteration in cytoadhesive prop- tering the parasites from the peripheral circulation and erties directly shaped the virulence of the human para- avoiding destruction in the spleen [8]. However, place- site, P. falciparum. Te selective forces that led to this ment of EMP1 on the RBC surface stimulates the host to change in EMP1 structure are unknown, however Otto generate antibodies against this protein, thus parasites et al. found that an unusual and highly conserved var must continuously vary the form of EMP1 they express gene called var2csa appears to be the sole remnant of the to avoid clearance by the adaptive immune response. Tis more ancient type of EMP1 remaining in P. falciparum. phenomenon is referred to as antigenic variation, and it Tis gene was originally identifed as encoding a form of is remarkably efective at enabling parasites to maintain EMP1 that binds to receptors in the placenta [18], but chronic infections that can last over a year [3, 9–11]. more recently has also been proposed to function as an Te ability of the Laverania parasites to continuously intermediate in var transcriptional switching [19, 20]. change the expressed form of EMP1 over the course of an Tey therefore hypothesized that this ancient gene has infection is rooted in the expansion of the genes encoding been maintained as a single copy in P. falciparum to serve diferent forms of the protein into large, multicopy gene the dual functions of cytoaherence within the placenta as families called var. A recent comparative study of the well as regulating var gene transcriptional switching [12]. genomes of seven Laverania species identifed var gene Tis study highlighted key steps in the evolution of the families ranging in size from 28 to 105 copies, with the var gene family, both in terms of the virulence linked to genes organized into clusters found within the subtelo- cytoadherence by EMP1 and the transcriptional regula- meric regions near the ends or in tandem tion that enables antigenic variation. arrays within the internal regions of the Here we extend these fndings to investigate the evo- [12]. Extensive studies in P. falciparum indicate that the lution of other elements implicated in transcriptional genes are transcribed in a mutually exclusion fashion, regulation of the var gene family. We found that both with all but a single gene maintained in condensed het- the regulatory proteins previously implicated in the erochromatin and thus transcriptionally silent [13–15]. control of var gene transcription and a unique regu- By switching the single var gene that is actively tran- latory element found within the conserved intron of scribed, parasites change the form of EMP1 expressed on var genes are present throughout the entire Laverania Gross et al. BMC Ecol Evo (2021) 21:139 Page 3 of 15 PTEF PADL01_0201200 PGABG01_0200100 * PBILCG01_0205100 PRG01_0201700 PPRFG01_0202500 PF3D7_0202400 missing missing missing JmJC1 PADL01_0809000 PGABG01_0809800 PBLACG01_0809600 PBILCG01_0815300 PRG01_0812700 PPRFG01_0811300 PF3D7_0809900 PVX_123283 missing PGAL8A_00526200 SET2 PADL01_1321000 PGABG01_1320200 PBLACG01_1320300 PBILCG01_1322500 PRG01_1324200 PPRFG01_1324000 PF3D7_1322100 PVX_116765 missing PGAL8A_00234100 RQ1 PADL01_0917600 PGABG01_0916000 PBLACG01_0918200 PBILCG01_0922000 PRG01_0926000 PPRFG01_0917400 PF3D7_0918600 PVX_099345 PBANKA_0819500 PGAL8A_00431400 WRN PADL01_1429900 PGABG01_1428600 PBLACG01_1429300 PBILCG01_1430500 PRG01_1429900 PPRFG01_1430600 PF3D7_1429900 PVX_085045 PBANKA_1014800 PGAL8A_00301500 gene regulation Sir2B PADL01_1451400 PGABG01_1450200 PBLACG01_1450800 PBILCG01_1450900 PRG01_1451400 PPRFG01_1452100 PF3D7_1451400 PVX_117910 PBANKA_1315100 PGAL8A_00210600 Sir2A PADL01_1327800 PGABG01_1327000 PBLACG01_1327100 PBILCG01_1329300 PRG01_1331000 PPRFG01_1330800 PF3D7_1328800 PVX_082385 PBANKA_1343800 PGAL8A_00241100 SET10 PADL01_1221600 PGABG01_1219800 PBLACG01_1220400 PBILCG01_1221100 PRG01_1224300 PPRFG01_1233800 PF3D7_1221000 PVX_123685 PBANKA_1436200 PGAL8A_00535100 Orthologous in var implicated genes encoding proteins - nowi parum rum ceum 1 Table genomic positions the syntenic at SET2 Genes encoded not found name. and JmJC1 are under each protein in columns listed each protein are on the left listed number for and the gene annotation species names are The blacklocki because the not be determined could of PTEF in P. existence text). *The with italicized (denoted gallinaceum or P. berghei P. vivax, in either P. not found PTEF are genes encoding , and syntenic berghei in P. this species assembly for genome sequence in the most recent of the genome is incomplete region syntenic P. alderi P. P. gaboni P. P. blacklocki P. P. billcollinsi P. P. reiche P. - praefalci P. - falcipa P. P. vivax P. P. berghei P. P. gallina - P. Gross et al. BMC Ecol Evo (2021) 21:139 Page 4 of 15

subgenus, indicating that many aspects that control var vivax and the bird parasite P. gallinaceum, however they gene expression evolved in the common ancestor of this are not encoded in the rodent parasite genomes (Table 1). entire group of parasites. In contrast, the divergence of It was previously suggested that these proteins were spe- var upstream regulatory regions into distinct subtypes cifc to primate parasites [22], however the availability of linked to virulence, as well as the evolution of var2csa whole genome sequences from additional parasite spe- and a second strain-transcendent var gene called var1csa, cies has identifed syntenic orthologues in parasites of occurred relatively recently within the Laverania subge- birds, suggesting an origin prior to the root of the Plas- nus, within the clade that includes P. falciparum. Tese modium genus and more general functions not specifc changes all appear to be related to the evolution of a for var gene regulation. more structured transcriptional regulatory mechanism, In contrast to the other regulatory proteins associated potentially enabling parasites to more precisely coor- with var gene regulation, PTEF (Plasmodium translation dinate the process of antigenic variation. Phylogenetic enhancing factor) appears to be specifc to the Lavera- analysis indicates that this change in the structure of the nia (Table 1). BLAST searches of the genomes of non- var gene family coincides with a well-documented altera- Laverania parasite species failed to identify genes with tion in African climatic conditions that likely altered signifcant similarity, and the syntenic genomic positions parasite transmission dynamics. Tus, the selective pres- of P. vivax, P. berghei and P. gallinaceum do not contain sures that shaped the evolution of virulence of the sole open reading frames encoding similar proteins. Tus, human-infective member of the Laverania lineage can be the evolution of PTEF appears to be coincident with the traced to two key events, one involving a change in host appearance of var genes in this branch of the Plasmo- receptor recognition for cytoadherence and a second that dium genus. In P. falciparum, PTEF has been described infuenced how antigenic variation is controlled. as a protein specifcally upregulated during pregnancy where it enhances translation of the EMP1 encoded by Results the unique var gene called var2csa [28]. var2csa is unu- Proteins implicated in var gene regulation are found sual in that the gene includes a regulatory upstream open encoded in the genomes of all Laverania species reading frame (uORF) that can suppress translation of Given the importance of var genes and EMP1 to the vir- the EMP1 encoding exons [31, 32]. PTEF was shown to ulence of P. falciparum, and the recent identifcation of interact with the translating ribosome complex to ena- signifcant changes in EMP1 structure over the course of ble translational reinitiation [28], thereby overcoming the evolution of the Laverania [12], we were interested the translational repression of the uORF and promoting in systematically considering the evolution of the mech- expression of a form of EMP1 that binds to the proteogly- anisms controlling var gene regulation and antigenic can chrondroitin sulfate A on the syncytiotrophoblasts variation in the parasite lineage that gave rise to P. falci- of the placenta [33]. var2csa was therefore proposed to parum. While the precise mechanisms that control var be translationally repressed in the absence of a placenta gene activation, silencing and mutually exclusive expres- through the uORF, but expression of PTEF when para- sion are not completely understood, numerous proteins sites infect pregnant women relieved this repression and have been implicated in various aspects of var gene regu- enabled parasites to take advantage of this new anatomi- lation, included the histone methyltransferases SET2 [21, cal site for cytoadhesion. However, var2csa and its regu- 22] and SET10 [23], the demethylase JmJC1 [22, 24], the latory uORF appear to have evolved relatively late in the histone deacetylases SIR2A and SIR2B [25–27], the trans- Laverania (see below), while PTEF is present in all mem- lation factor PTEF [28] and the RECQ helicases RQ1 and bers of the subgenus, indicating that PTEF must have a WRN [29, 30]. Syntenic orthologues of each protein are function in addition to or beyond regulation of var2csa found throughout the Laverania, and with the exception translation. A recent genome-wide assessment of P. fal- of PTEF, all can also be identifed in the non-Laverania ciparum transcripts noted that the number of potential parasites P. vivax, P. berghei and P. gallinaceum (Table 1), uORFs is remarkably high [34] and given the substantial therefore they must have existed in the common ances- degree of synteny and sequence homology within the tor of all Plasmodium species. Tese proteins represent entire subgenus, this extends to the other Lavarania spe- orthologues of factors known from model organisms to cies as well. Te role of PTEF in overcoming the repres- play roles in chromatin assembly, transcriptional regula- sive efects of uORFs on mRNA translation therefore tion or aspects of DNA repair, thus they are likely to have could include a broader class of mRNAs beyond var2csa, broader roles in parasite nuclear functions beyond their thus explaining its conservation in parasites that do not functions in var gene regulation. Te histone methyl- possess this particular var gene. transferase SET2 and its cognate demethylase JmJC1 are similarly found throughout the Laverania as well as in P. Gross et al. BMC Ecol Evo (2021) 21:139 Page 5 of 15

A region 1region 2 region 3 region 1 region 2 region 3 0.8 0.8 0.8 exon1 exon 2 0.6 0.6 0.6

~0.8 - 1.2 kb P. billcollinsi 0.4 0.4 0.4

0.2 0.2 0.2 region 1 region 2 region 3 B 0.0 0.0 0.0 GATC GATC GATC 0.8 0.8 0.8 0.8 0.8 0.8

0.6 0.6 0.6 0.6 0.6 0.6

P. falciparum 0.4 0.4 0.4 P. blackloci 0.4 0.4 0.4

0.2 0.2 0.2 0.2 0.2 0.2

0.0 0.0 0.0 0.0 0.0 0.0 GATC GATC GATC GATC GATC GATC

0.8 0.8 0.8 0.8 0.8 0.8

0.6 0.6 0.6 0.6 0.6 0.6

P. praefalciparum P. gaboni 0.4 0.4 0.4 0.4 0.4 0.4

0.2 0.2 0.2 0.2 0.2 0.2

0.0 0.0 0.0 0.0 0.0 0.0 GATC GATC GATC GATC GATC GATC

0.8 0.8 0.8 0.8 0.8 0.8

0.6 0.6 0.6 0.6 0.6 0.6

P. reichenowi P. alderi 0.4 0.4 0.4 0.4 0.4 0.4

0.2 0.2 0.2 0.2 0.2 0.2

0.0 0.0 0.0 0.0 0.0 0.0 GATC GATC GATC GATC GATC GATC Fig. 1 Conservation of var intron structure in all seven Laverania species. A Schematic showing the division of var introns into three regions based on base content and strand asymmetry. B Calculation of base content of the positive strand of var introns for each species of the Laverania. Note the prominent G vs C asymmetry in regions 1 and 3 and the conservation of this asymmetry in all seven species

The regulatory element found in var introns is conserved the display of G vs C and A vs T content within the sepa- throughout the Laverania subgenus rate regions, with region 1 containing high G content Initial characterization of var genes in P. falciparum iden- and virtually no Cs and region 3 displaying the inverse tifed a conserved, two exon structure that includes an relationship, with respect to the positive DNA strand unusual intron that separates the extracellular domain (Fig. 1). A bi-directional promoter driving expression of of EMP1 from the intracellular portion that anchors long, noncoding RNAs has been mapped to region 2 [36, the protein in the RBC membrane [35]. Tis intron has 41]. Tis conserved structure is found within the introns independent promoter activity, giving rise to long, non- of all var genes in P. falciparum, with the exception of an coding RNAs of unknown function [35–37], and it has unusual type called var3 or “type 3” var genes. Type 3 var been implicated in the regulation of var gene silencing genes have been identifed in some, but not all, strains of and mutually exclusive expression [38, 39], although the P. falciparum, and their function is unknown [42–44]. precise mechanism remains undefned [40]. Close exami- Te introns of these genes lack region 2. nation of the sequence of var introns identifed a distinct To determine if the regulatory element found in the strand asymmetry, with each intron easily divided into var introns of P. falciparum is also observed in var genes three well defned regions based on base content, with regions 1 and 3 displaying “mirror images” of one another [41]. Te strand asymmetry is easily discernible through Gross et al. BMC Ecol Evo (2021) 21:139 Page 6 of 15

AB

V Laverania V

chromosome 4 P. alderi 59%

V

V V V V

P. gaboni chromosome 4

53%

V V V V

P. blacklocki chromosome 14

43%

V V

P. billcollinsi chromosome 14

39%

V V

P. reichenowi chromosome 5

93%

V V V V

P. praefalciparum chromosome 5

92% V

V V V V

P. falciparum chromosome 5 V Fig. 2 Conservation of the unusual var gene var1csa in P. reichenowi, P. praefalciparum and P. falciparum. A A phylogenetic tree of the Laverania as described by Otto et al. [12]. B Identifcation of var1csa in P. falciparum (isolate SN1) and its orthologues in P. reichenowi and P. praefalciparum. The chromosomal region from each species containing the var gene with the highest sequence identity to var1csa from P. falciparum is shown. var genes are shown in blue and members of other multicopy gene families are shown in green. The genes with the highest sequence identity to var1csa in each species are aligned, and the percentage identity is shown below the gene. For P. reichenowi and P. praefalciparum, sequence identity exceeds 90%, and the genes display clear synteny (boxed in blue). For the remainder of the Laverania, the sequence identity is substantially lower and the most similar gene is not located in the syntenic position, indicating these species do not contain true orthologues of var1csa

from parasites throughout the Laverania subgenus, we genome assemblies of these species might enable identif- obtained var intron sequences from annotated var genes cation of var genes that are currently missing. from the 6 other Laverania species. Each intron sequence was divided into three regions and the average base con- Evolution of strain‑transcendent var genes in a subset tent calculated for each region as previously described of Laverania species that includes P. falciparum [41]. Te content for each region was displayed Te var gene family has been extensively studied in P. as histograms (Fig. 1) and all seven species show virtu- falciparum, including the vast diversity of individual ally identical patterns of strand asymmetry. Te unusual genes when the complete genome sequences of multiple, base pair structure of var introns originally observed in P. independent geographical isolates were compared [44]. falciparum therefore is also found in var genes through- Attempts to measure the level of var sequence diver- out the Laverania and likely extends to the origins of var sity within populations of naturally circulating parasites genes themselves. With the exception of the Type 3 var have similarly described enormous variation [9, 47] that genes mentioned above, the intron structure does not appears to result from frequent segmental recombina- appear to vary based on the chromosomal location or tion events between genes [48, 49], a phenomenon docu- gene type, consistent with what was previously reported mented in cultured parasites [50–52]. Remarkably, two for P. falciparum [45], Interestingly, similar to a previous var genes appear to escape recombination with other report [46], we were unable to identify Type 3 vars in spe- members of the var gene family and remain highly con- cies other than P. falciparum, although more complete served in all P. falciparum isolates. Te frst of these genes, called var1csa, is found near the “right” end of Gross et al. BMC Ecol Evo (2021) 21:139 Page 7 of 15

A P. reichenowi

chromosome 12 P. praefalciparum

V

V V V V V V

chromosome 12 SN01

V V V

V V

chromosome 12 7G8

V

V V V V

chromosome 12 3D7 V var2csa B

P. praefalciparum SN01 3D7 P. reichenowi 7G8

P. praefalciparum SN01 3D7 P. reichenowi 7G8

Fig. 3 Conservation of the var gene var2csa in P. reichenowi, P. praefalciparum and P. falciparum. A Schematic showing the syntenic region of chromosome 12 in P. reichenowi, P. praefalciparum and three isolates of P. falciparum (SN01, 7G8 and 3D7). The positions of the single copy genes fkk12 and acs7 are shown and represent the border between the core genome and the highly variable subtelomeric domain. The orthologues of var2csa are aligned and boxed in blue. B Alignment of the amino acid sequence encoded by the uORF of var2csa from P. praefalciparum, P. reichenowi and three isolates of P. falciparum (SN01, 7G8 and 3D7)

chromosome 5 at the boundary between the subtelom- if var1csa might exist at an alternative genomic posi- eric region and the core genome. Unlike other members tion only identifed non-conserved var genes with sub- of the var gene family, it is transcribed late in the asexual stantially less sequence identity (Fig. 2), indicating that cycle and appears to not be subjected to mutually exclu- var1csa is limited to P. praefalciparum, P. reichenowi and sive expression [53, 54]. In addition, in all isolates exam- P. falciparum. Interestingly, the gene in P. praefalciparum ined to date, the gene carries a premature stop codon in and P. reichenowi also contains a predicted premature exon 2, leading to its annotation as a pseudogene, and in stop codon in exon 2. Te selective pressure maintaining the 3D7 reference genome the gene is truncated due to this unique var gene is unknown but seems unlikely to be a subtelomeric deletion event. To determine if the con- related to its protein coding capacity given the unusual servation of var1csa extends beyond P. falciparum, we transcription pattern displayed by the gene in P. falcipa- examined the syntenic region of chromosome 5 in all 7 rum as well as the premature stop codon. members of the Laverania subgenus. We found var genes Similar to var1csa, var2csa is conserved in the genomes with 92% and 93% sequence identity at this genomic posi- of all P. falciparum isolates described to date [16]. Te tion in P. praefalciparum and P. reichenowi, respectively, gene is duplicated in some isolates of P. falciparum and but no genes with similar levels of identity in any other was also observed as multiple copies in the sequenced species (Fig. 2). BLAST searches of the complete genome isolate of P. praefalciparum [44, 55], but all isolates of sequences from the remaining species to determine P. falciparum maintain a syntenic copy at the boundary Gross et al. BMC Ecol Evo (2021) 21:139 Page 8 of 15

between the subtelomeric region and the core genome at epigenetically regulated activation and silencing, as well the “left” end of chromosome 12 (Fig. 3A). Also similar as mutually exclusive expression. However, more detailed to var1csa, var2csa is conserved in P. praefalciparum and analysis of the upstream regulatory regions of all mem- P. reichenowi, specifcally a conserved copy is located at bers of the family found that they could be organized the syntenic position of the left end of chromosome 12, into three basic types, called UpsA, B or C, based on but no copies of var2csa are found in the other mem- DNA sequence similarity [45, 57], and that these types bers of the Laverania subgenus. As mentioned above, correlated closely with the genes’ position within the var2csa is unique amongst var genes in its inclusion of chromosomes. Two additional types, called UpsD and a 360 bp uORF within the 5’ leader of its mRNA. Tis E, are associated with var1csa and var2csa, respectively. uORF serves as a translational repressor of the EMP1 Te importance of Ups type was suggested by studies of encoded by the main open reading frame [31, 32] and is clinical samples obtained from individuals with either also highly conserved in all three species. Tis conserva- severe or asymptomatic disease. Parasites isolated from tion extends to the amino acid sequence of the peptide children with severe disease were most often found to encoded by the uORF (Fig. 3B). Tis gene was originally express UpsA or B genes [58–60], while UpsC genes were identifed for the role of the encoded form of EMP1 in not specifcally associated with disease condition. Studies binding to the host cell surface receptor chrondroitin sul- with cultured parasites found that UpsA and B genes also fate A in the placenta [18], however more recently it has displayed higher switching rates than observed for UpsC been suggested to play a role in mediating or coordinat- genes, which are more stable once activated [61, 62]. ing var gene expression switching, potentially as a default Tese studies suggest that the upstream transcriptional gene or node in a hypothetical var gene switching net- regulatory regions have evolved into diferent types to work [19, 20, 22]. In a more recent analysis of the domain infuence how expression of the gene family progresses structure of EMP1 proteins throughout the Laverania, over the course of an infection. Otto and colleagues identifed a surprising change in Given the potential importance of the organization of structure of the domains that make up EMP1 in the the upstream regions for regulating var gene expression clade that includes P. billcollinsi. P. reichenowi, P. prae- in P. falciparum, we investigated when this organization falciparum and P. falciparum [12]. However, the EMP1 evolved in the Laverania lineage and whether a similar encoded by var2csa appears to be more closely related to organization exists in other members of the subgenus. those found in the more distantly related Laverania par- Similar to the previous analyses that identifed the UpsA/ asites, leading them to suggest that it might be the sole B/C/D and E types [45, 57], we constructed maximum- remnant of an ancient form of the protein that remains likelihood phylogenetic trees to identify subgroups of within the genomes of these parasites [12]. Tey further upstream regulatory domains based on sequence simi- suggested that the selective pressure that has prevented larities. Using this method, we readily detected the three this particular gene from recombining with other mem- primary types (UpsA/B/C) in P. falciparum, and similarly bers of the family could be related to its putative role in found them to generally correlate with genomic posi- transcriptional regulation rather than in the cytoadher- tion, thus replicating the previously published analyses ence properties of the encoded protein. If correct, this [45, 57] (Additional fle 1, Fig. 1A). Applying the same suggests that the function of var2csa in regulating var methodology to the var gene upstream regions from P. gene transcription evolved in the common ancestor of praefalciparum and P. reichenowi, we were similarly able P. reichenowi, P. praefalciparum and P. falciparum, and to detect clearly diferentiated upstream types, yielding implicates the uORF in this function. phylogenetic trees with separated branches that again loosely correlated with genomic position (Additional Divergence of var gene regulatory regions into diferent File 1, Fig. 1B, C) suggesting that in these species the types var gene family is organized into Ups types similar to P. With the initial completion and assembly of the P. falci- falciparum. For the other four Laverania species, fewer parum reference genome [56], it became evident that var full-length upstream var sequences were available, how- genes are typically found in one of three genomic posi- ever similar analysis yielded trees with less well defned tions: in subtelomeric regions transcribed away from the branches and with the upstream regulatory regions from telomere, in subtelomeric regions transcribed toward the the diferent chromosomal locations distributed through- telomere, or in tandem arrays found within the interior out the trees (Additional fle 1, Fig. 1D–G), suggesting of the chromosomes. Te signifcance of this arrange- that these regulatory regions are either not organized ment was not immediately apparent since all var genes, into separate Ups types, or that the organization is less regardless of their position in the genome, appear to be strictly defned. subject to similar transcriptional regulation, including Gross et al. BMC Ecol Evo (2021) 21:139 Page 9 of 15

P P

PRFG01 F PR UpsA

PPR

PPRF P G

PPRFG01 00 PF3 PRFG

PPRFG01 053670

PPR 0

RF F PP

PP 1

1 PRG01 PPRFG0

PPRFG G0 080020 D7

0098600

1

PF3D

G01 050020 P

P

RFG01 0011 1

1 FG01 0058800 PPRF

PR

0

P 5

G PR

PF

P

G

1

13 1

R 1100200 7 F3D7 0400 F3D7

3

0 3 FG

D7

PF

P

0026700 0

F

75 1 0

PGABG 01 P

F

P P

0 P

5 11

3D7 0425800 3D7

1 G

0

3D

1 G0 RG

2 03 01 0 R 01 PRFG 1

PA 01

PADL01 40 RG 0400

0 1

0 2790 8 G

0 7

06300 0

1 0

3

0 0

04

D 0011

1300

1

0 PP

01

0

0 1

0 0

L01 00 2

P 00 400 0

01 10

7 0 0 6

PP R

R 0

0 PF 038 5300 8300 0

3 3

PRCD 34 30 0

CD 0 FG 400

13

12 9046

0 3

121980 3 P RF 0 0

0

D7 3 113700

PRFG 8 0

1 0 0

PBLA C 00 01 0 0 0 G01 00 830 PR 00 var2csa PF

8000 0 C 002 0 40

1 0 0 3 00 1

0 0 0 0 G0 0

C 0 PR 1 D7 06004010 0 P 3 93 2

PADL01 G 0 0 0 0 PAD GABG0 PRG01 CD 0 08 000 5 P 1 140 0 0 0 01 8 0 0 3 0 R 0 0300100 0 (UpsE) 0 0

PF3D7C 093760 9 0 07

0036 6 0 00 0 0 0 0 0 PADL01 4 G 1

L 0 0 0 0 30 * 70 0 0 0 0 01 8100 00 P 0 90 2403 0

PA 1 1 04 PR RG 01 0 3 1 0 7 4 0 0 PBILCG01 03 P 70 0 0 1 09044 1 001

4 0 0 3 RG01 1 PPRFG0 00 0 32 7 PGABG01 0808700 G0 0 A D 7 1 PRG01 0 0 0

P 2 7 PPRFG0 0 102000 0 2 0 0 00 P D 1 00 0 DL 6 L 11 11 G BI PPRFG 1 1 7 00 0 42380 L 0 122010 0 0 0 R 0 0 0 1 30 20 0 0 0 *RFG0 CG P 1 4 PF3 1 B P 04 071 1 1 1 P P 0 0 55 1720 1218600 00 01 140ILC 0 00 C P B 1 0 0 0 0 0 00 100 8 7700 000 I RG01 0 2 L 2 18 1 0 0 0 D C 4 7 080910 CG0 6000 G CG01 0 0 00 PRG0 02900 00

0 5 P 0 IL 731 000 900 C 01 100010 3 00 0 17 0 40 0 3D 1 100 00 1 4 PRCDC 0 0 0 0 0 C 003

2 R PB 100 30 0 0 30 0 F 570 0 81 42 P 0 D PPRFG01 0 P 0 0 0 0 C PRCD C 01 1254 PBLACG0 P 6 0 1 0419 0 0 7 3 0 24 var1csa 0 D 00 1 BLA 8 PG 0 1 1 1 8 4 1 200 PR 9 0 00 P PBLACG01 040980 3 8 61 C RG01 2 B 6 P PRG01 00 0 2 1 0904 00 AB 4 P G01 0600 LA C 0 7 AD 0 0 4 G0 RG01 07101 RFG0 01 PBLAC 0 1 0 0 42 PR 0616 PPRFG P G 3 01 R P 1 CG01 PBLA0 0 G 01 0027000 0 RF AD 01 L 1 1 PP 124 0 1 1 P PRG01 P 01 PPRFG7 120 01 G 0 00 1 0 0 0 RG 1 P D (UpsD) L01 0402400 0 0412 1 0 10 0 PB C R 0 G G G P 8 G 4 0 G 01 122290 0 PRG01 00 PF3 00 4 G0 P 00 0 RFG01 08 L 0 10400 4 G 00 AC 0 10 PRFG 6 P 0 1 1 PR PR PR P 9 07 0 P 30 0 0 3 R PR RG 1 G01 14804 0113 0 G 6 0 9 P P 1 0 01 121130 41 1 0 00 12 0 PR 6 G C 00 PA 97 9 0 30150 1 0630000 00 0 01 1243600 R 0 RCD 00 0 D 0 G01 P P 01002 022200 L01 06PBLACG01 G C 40 9 00 0 L DC 0532000 34500 0 3 01 00 PR I C 7 0531 31 80 G PADL B R 3D PGABG 0323 1 P PRCDC P F FG01FG01 004 05 01 PGAB P 5700 P PR 1 0322 PADL0101 07 G 0419200 PPRP 0 PGABG01 0600100 1 ABG01 0710500 PPRFG0 PGABG 241 PADL PGABG01 100 11200 071100 0

PADL0 PB PGAB 1 PBLAC LACG01 000 G01 1 P G 0 0010 PA 200 BLACG0 01 D 04 0411300 550 10 L01 05 11100 0 0 P 1 041 000 GABG01 3 0 1 2200 19 900 0 80020 0 G 0 00 PGABG 0 F 58 01 009 12 R FG 01 0 PG PGABG013 04 P 021 ABG01 0 900 P PPR 900 PPRFG01PPRFG 125630001 0 031400 PPRFG010 00000 P 02120001000 8 GABG0 RFG01 0 087 1 PP P 0008 0 GAB 7 PPRFG01 7900 400 PGAB PGABG01G01 0 0 00 100 CDC 1039000G01 090 300 1 40 PR PR 1 0800 1 04 00 G01 001350 0700 PRG0 CG0 5 02330 PRG01 1400300 IL 0401 PGABG 0 0 G01 0305300 PB PG 01 07 PR LCG01 PBLAC ABG01 001450 00100 PBI 0 G01 0418800 0 002070 PGABG CG01 01 041 PBIL PADL01 04 8500 0 19900 1470 PADL01 0042500 100 PBILCG01 04 PGABG 01 1 LCG01 0617800 0 218500 PBI 071290 PGABG01 C 0410300PF3D7 0009400 PRCD PPRFG01 0712300 PADL01 0301200 * PADL01 0011800 00 PRG01 07110 0 PGABG01 0616300 PRG01 041450 PRCDC 1239700 PADL01 0012400 0 PBILCG01 080930 PBLACG01 0600200 PBILCG01 0814500 PBILCG01 0814200 PBLACG01 0324400 0 PBILCG01 0401000 PADL01 120090 200 PF3D7 0808600 PGABG01 1211 PPRFG01 080 1212600 0 PRG01 04180PRG01 0810 9700 PADL01 080800 500 * ADL01 00 P PR 37900 CDC 01 00 0041900 0001 PADL 800 PADL0100 PPR 0 00 058 0 PPRFFG P 001290 002 00 180 01 080F3D PADL01 L01 11 01 53 7 06174 PAD PADL 1 0 G01 990 L0 0 0 00 D 41 PA 89

00

00 100

5

0 *

0 0

10 100 0

80 24 3 22

7 1 0 4

0 0

0300 1

93 87900 G0 20

0 RF 01 0001 0810

0 P FG G

7 3D

0 P R F 00

PF3D7 PF3D7

1 PP

60

100100 PPR 00

0

PF

1 126

0

F3D7 07 042 0

P 3D7

30 PF 041250

D7 D7 0

3 G01

D7 F 41810

0 R 0

3 P

7 P

3 FG01

0 PPR 0

F

D7 0200 D7

0418500

P PF RFG01 3 PP

F

6328

D7 PPRFG01 0 071220

3 PPR 0

P FG01 7

F 100 071

D 15

P 00

3 PP

0 RFG

0 01 PF 0 0

1 100 4118

0 PF3 00

70 PF P

80 D7 04

15 P

0 R PP PF3

01 3 D7

7 7 FG R 1 00

7 D7 FG 24 0

9 7

4

D 01 04 0 11 3D

2 01 0 7

3 0 0

0 F 1 0

03 0

1

PF 4 P

0 24 1830 1

D7 8

0 0

4 F3 00

0 0 0

0 P

PF 0

1 90

0 7

0 PF

0

D 3 3

1

3 D

4 0

0 0

7 09 7 3D

1

7 0

D 7

PF D

3 7 7 PF 3

PF 042 1

00 2

2552 00

1

7 7 PF 3D

F PR 0

P 1300

0

10

0

0 PF 3 12

D7 3D7 G F P 01

3D7 04 PF3D7 1000100 PF3D7 04 2

15 042090070

50 0

0

0 1 0 0 3 1

D7 3 0 PF

P PF3 0 F3D7

05 PF P 0 0100 P PBILBC

F 3 D7 IL

3 D7 C P

P

RFG01 RFG01 P D 0 G0

0029500 P 7 41 G 1

PP R 124006 2 01 0414200 G

RF 4

P 7

P 1 12 0 RFG0 F

G PPRF

1 0

0 2

0

1 1 G 0

8

0

0

6 9

98

8 PRCDC 19 0 7

0

D PF3 0

0 0

0 003650 0

1 PPRFG

P

0 0 1 0 0 100

R PP 5

F3

0 PR PRG01 0

0

P

R PP 1 UpsC 0

D7 CDC PPRF D F3 0

7 7

0

F P 04 4

PP

PF R

G 0 043

1

G 00 0

7 0808700 7 P 1

0 0

21 01 04256 R

FG 1 0

7

1 PR 70

FG 2

0

3D7 3D7 0 P

PF 0

9

01 01 01

0 2

01 001150 01 G

300

5

P P 0 RFG

0 01

3 00

3D7 3D7 R P 0 G

0 0

0

2 P PR RG

F3 G

0

0 0

0 RG0 0

0 1 0

2 8 G 1 01 041

8 11

13 01 04 D 1 1373500 33

0 100 P 1 0

00 04 66 01

7 R 1 P 61 0 7 50 P 0 01 00

C 0 RG01 07 154 0 8

RCDC 8 1

0 DC 11 P 0 0 1

2 PRG0 00 0 P R 0 05

PR PRG 000 G 0

PR RG0 8

1 0 0

1 0 7 0 0 1

P 0 PRCDC 00 1

G0 0 0 F3 3 0 P 0 0 700 6

PRG G

R 0 4 1 7 81 0 D7 071280 D7 1 0 1 0 18 3

1 0 P CDC 1239 0 0 06 0 040350 0 PF 1 R P PR 50 4 0

PRG0 CD 4 4 0 0 01 081190 RG 0 3D 25 1 2 P 0

PRCDC 0710 PRCDC 41740 G 4 6

F3D7 0 3 1 1 PRG0

7 C 4 7 7

PRG01 0710600 P 0 0 0 0 PR 0 0

P 01 04 500

PF3D7 0413 PF3D7 R 1 0001 0 0

71 PRCDC 00 P

PR R

12 1 0417600 G0 G 0 600

2 RG

0 G

P 0 4 300 40400 P 0 P 25 G R 01 0417500 PRG0 1 6

PRG0 162 1 01 RG P

RG01 0

0 P C

0 0 RG0 07 0 7 0

F 1

14 71 00

D 3 0 3D7 1200400 3D7 PRG0 00 01 0711200 1 0 416000 P. falciparum 3D7

PRG01 0 C

P 15 5

PR 1 41

PPRF 1 04 00 1 1 60 RC

PR

00 0

040280

7 0810 00 P 0

10 7 0 P

PF 350

G01 G01 0 1 1

PRFG0 0

039 7

P 200

G0 1 D CDC

P PRG01

RFG0 17 1 PRG01 PRG01

0

3

P

1 C 1

D 0416

P RF 0

36 08 R PP

P 0013700 3 PPR 0 0

7 PRG0

F

040440

F3 003630

P 0

04 P. praefalciparum 00 0 3D7 3D7

G 0

1 0 0 PR

0

50 0 0

0

P

D P

0

00 1

F

FG01 FG01

2 500 04

08

PRFG01 0062 PRFG01

7 G01 1256400 G01 11 PF3D7 PF3D7

0 R 080

F

01600 0

062900 06

0 PPRF

PP

67 P PPRFG01 PPRFG01 1

G0 2 071040

PRG01 PRG01 P G

1

0 4179

6 41

0

030 900

P

04150 RG01 04 PPRFG01 0800100 00

PR 0

3 0 0

R 002

R

2 PP PPR 1

FG01 00 FG01

0 5

001 7

200 0600

F PPRFG01

G 0

0 PF3D7

G 00

9 00

G01 041280 G01

0 R PPRFG 8 P. reichenowi G01/CDC

0 600 1 RFG0 PP

01

1

86 PR

FG F 04 1 0

G01 002780 1

0

00

0

80 8 00349

P

08 1

G0 3440 01

0

PR 41 8 15

0

0 3660

1 0403

0 110010 712400

RF

PP 00

FG01 00 FG01

13

0

1 0005 0

00

00 P. blacklocki

0939

6 75 0 0 G

1 0 0

0

0

0 00

1 1 7

00 587

00

0

00 0

53

680 P. billcollinsi

P

0

R

C D

0.20 C P. gaboni G01 0

UpsB 0

1

1

6 0 0 P. alderi Fig. 4 Maximum-likelihood phylogenetic tree of 0.5–1.5 kb upstream regulatory regions of var genes from seven Laverania species. The annotation number for each gene (from Plasmodb.org) is shown and the colour of the text signifes the species, as shown in the lower right panel. Sequences of the UpsB type are shaded in pink, UpsA in light blue, UpsC in yellow, var2csa in gray and var1csa in dark blue. Five genes with atypical upstream sequences from P. falciparum that segregate outside of the UpsA/B/C/D/E groups are marked with an asterisk. The evolutionary history was inferred using the Maximum Likelihood method and Jukes–Cantor model [63], with bootstrap values for 1000 replicates shown for various nodes. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. Evolutionary analyses were conducted in MEGA X [64]

To more closely investigate the origins of the Ups pro- reichenowi, with no sequences from the other four spe- moter type organization, we combined the var upstream cies represented within these branches. Tis indicates regulatory sequences from all seven Laverania species that the UpsA and B promoter types evolved at the root and constructed a single maximum-likelihood phyloge- of the branch of the Laverania that includes P. praefalci- netic tree (Fig. 4). Of particular interest are the branches parum, P. reichenowi and P. falciparum, coinciding with that represent the UpsA and B sequences. Both of these the evolution of var1csa and var2csa into distinct, strain var regulatory types segregate into groups that are rela- and species transcendent genes. UpsC sequences are tively distant from the other members of the var gene similarly found within a distinct branch that includes P. family. More importantly, these branches only include falciparum, P. praefalciparum and P. reichenowi, but this sequences from P. falciparum, P. praefalciparum and P. group also includes two sequences from P. billcollinsi, Gross et al. BMC Ecol Evo (2021) 21:139 Page 10 of 15

P. alderi (gorilla) A B

Gorilla (P. praefalciparum) P. gaboni (chimpanzee) gorilla-to-human transmission * * Human (P. falciparum) P. blacklocki (gorilla) 8-9 Mya parasite loss var genes, var intron structure, Chimpanzee (P. reichenowi) PTEF P. billcollinsi (chimpanzee)

Bonobo (P. lomamiensis) P. reichenowi (chimpanzee)

* P. lomamiensis (bonobo) Late Miocene expansion of savannah shift in CIDR/DBL P. praefalciparum (gorilla) structure, evolution * of UpsC

evolution of var1csa, P. falciparum (human) var2csa, and UpsA/B Fig. 5 Phylogenetic trees marking key moments in the evolution of the var gene family. A The evolutionary relationships of the Laverania are shown, with each parasite species and its ape host denoted. var genes, the regulatory intronic element and the putative regulatory protein PTEF are thought to have originated in the common ancestor of the entire subgenus (shown by green asterisk). A signifcant shift in the structure of the var encoded protein EMP1 and the UpsC upstream regulatory type occurred in the common ancestor of the bottom fve species (shown by red asterisk) while the evolution of the conserved genes var1csa and var2csa, as well as the UpsA and B types, occurred prior to the branch that includes the human parasite P. falciparum (shown by purple asterisk). B The codivergence model of parasite speciation is displayed showing the branch of the Laverania that evolved var1csa, var2csa and the UpsA/B/C organization. The phylogenetic tree shows the evolution of gorillas, humans, chimpanzees and bonobos from a common ancestor ~ 8–9 million years ago. This model presumes that parasites species diverged along with their hosts, resulting in P. praefalciparum, P. reichenowi and P. lomamiensis infecting gorillas, chimpanzees and bonobos, respectively. Humans are hypothesized to have lost their original parasites, then become reinfected through a recent gorilla-to-human transmission event, resulting in P. falciparum. These ape species (and their parasites) initially diverged during the late Miocene, a period of major climatic change in Africa marked by aridifcation and a regional shift from rainforests to grasslands and savannah

indicating that the UpsC type evolved somewhat earlier that it too might be derived from an ancient var gene than either UpsA or B. type that is no longer found in P. falciparum. In addi- Interestingly, while the highly conserved aspects of tion, the Ups regions from fve P. falciparum var genes var2csa, including the presence of the uORF in the 5’ (Pf3D7_0809100, Pf3D7_1240300, Pf3D7_0617400, UTR of the transcript, its location in the subtelom- Pf3D7_0712900, Pf3D7_080600) segregate within eric region of chromosome 12 and the structure of the branches apart from the UpsA/B/C organization that encoded form of EMP1, all originated in the common includes the rest of the P. falciparum var gene family. ancestor of P. falciparum, P. praefalciparum and P. reiche- Four of these Ups regions were previously noted by Lavs- nowi (see Fig. 3), the upstream region of var2csa, also tsen et al. to have atypical sequences [45], and they are called UpsE, invariable clusters with var genes from the all located at the boundaries between var gene contain- more distant Laverania species (Fig. 4). Tis indicates ing chromosomal segments and the core genome, within that var2csa evolved from a type of var gene that is no the interior regions of the chromosomes. Genes at such longer found in P. falciparum, thus explaining why it positions are likely to be more constrained in their abil- departs from the UpsA/B/C organization. Tis is consist- ity to undergo recombination with other family members ent with the conclusion of Otto and colleagues, based on due to the adjacent unique, highly conserved sequences, the structure of the encoded protein, that var2csa is the thus preventing them from diverging as rapidly. Tis is sole remaining example in P. falciparum of an ancient var consistent with the clustering of these Ups regions with gene [12]. We similarly found that the upstream region sequences from the more distantly related P. reichenowi of var1csa (called UpsD) also segregates with genes and P. praefalciparum rather than with other P. falcipa- from the more distant Laverania species, suggesting rum var genes. Gross et al. BMC Ecol Evo (2021) 21:139 Page 11 of 15

Discussion circulating populations would result in much more rapid Both the evolution of the UpsA and B var upstream regu- immune exposure of the entire var repertoire. Terefore, latory domain subtypes and the development of the spe- we hypothesize that the evolution of var1csa, var2csa and cies-transcendent, conserved genes var1csa and var2csa the UpsA/B/C/D/E organization of the var gene family arose in the common ancestor of P. praefalciparum, enabled parasite populations to maintain more lengthy P. falciparum and P. reichenowi as well as P. lomamien- infections, although why this trait would have been under sis [65], a parasite of bonobos with insufcient genome such strong selection pressure specifcally in the common sequence data to be included in the analysis described ancestor of this clade of the Laverania is not immediately here (Fig. 5A). Our analysis suggests that a strong selec- apparent. tive pressure on parasites at this particular moment in the To better understand what might have given rise to evolutionary history of the Laverania led to the evolu- the evolution of a more coordinated mechanism of var tion of var1csa, var2csa and the UpsA/B/C organization gene transcriptional regulation, we considered the cur- of the var gene family. Further, this selection pressure rent models for the speciation of the Laverania. Using appears to have been sufciently powerful and sustained estimates of mutation rates and Bayesian whole-genome to maintain these changes as the subsequent parasite estimates, Otto and colleagues dated the origins of the species diverged, despite the hyper-recombinogenic Laverania to 0.7–1.2 million years ago, and the common nature of the var gene family and the plasticity of the sub- ancestor of P. reichenowi/P. praefalciparum/P. falcipa- telomeric regions of the genome, two characteristics that rum to 140–230 thousand years ago [12]. Others have would be predicted to rapidly disrupt this organization in also dated speciation events within the Laverania to the absence of strong selection. Tus, the selection pres- timepoints in the relatively recent past [68]. In contrast, sure that led to this change in var gene organization likely Hahn and colleagues proposed a much older origin of represents an exceptionally strong infuence that shaped this parasite clade. Teir analysis indicates that the spe- the genome of the most virulent of the human malaria cies within this branch of the Laverania arose through parasite species. codivergence with their ape hosts, beginning with the Te functional signifcance of the development of the ancestors of gorillas and chimpanzees approximately var1csa/var2csa/UspA/B/C organization of the var gene 8–9 million years ago (Fig. 5B) [65, 69]. Tis model sug- family is unknown but is likely to involve how the fam- gests that humans lost their initial Laverania parasites ily is transcriptionally regulated. Mathematical modelling after diverging from chimpanzees, and subsequently has previously proposed that coordination of var gene acquired P. falciparum through a relatively recent gorilla- transcriptional switching might be achieved through to-human transmission event [70]. Te latter model is central organizing var genes that serve as nodes within particularly intriguing since it places the common ances- a switching network [66], and additional work has impli- tor of this parasite clade within the late Miocene period, cated var2csa in this function [19, 20]. If this model a time of substantial climatic and environmental change holds, var1csa might similarly contribute to coordinating on the African continent that is known to have contrib- var gene switching patterns, thereby providing an expla- uted to ape speciation [71]. Specifcally, the continent nation for its universal conservation despite a timing of experienced progressive cooling and aridifcation, result- transcription inconsistent with the production of EMP1 ing in the contraction of continent-wide rainforests and a and its annotation as a pseudogene. Te divergence of drastic expansion of grasslands and savannah (reviewed var upstream transcriptional regulatory domains into in [72]). Tis change in environmental conditions would UpsA/B/C/D/E types could provide additional structure have created large geographical ranges with seasonal to the switching network, thereby organizing the family transmission, including lengthy time periods in which into a hierarchy of genes with diferent switching rates. malaria transmission is greatly reduced or eliminated, for Tis would infuence the probability of specifc genes example dry seasons when the mosquito vector can be being activated at diferent time points of an infection or absent for months. In order to sustain transmission under in the presence of diferent degrees of host immunity, as such conditions, parasites would have been required to has been observed [59, 60, 67], and would contribute to extend individual infections to span such extended time a coordinated order of gene activation. More sophisti- intervals when transmission is absent. Tis is precisely cated coordination of var gene switching has been pro- what more coordinated var gene transcriptional switch- posed as a way for parasite populations that can number ing patterns have been proposed to do [66], and provides in the billions of individuals to limit var gene expres- a compelling hypothesis for the selection pressure that sion to a single or very small number of var genes at any underlies the origin and maintenance of var1csa/var2csa/ given time [66]. In contrast, uncoordinated, random var UpsA/B/C/D/E organization of the gene family. While gene switching by individual parasites within such large the current geographical distribution of both host and Gross et al. BMC Ecol Evo (2021) 21:139 Page 12 of 15

parasite species have been documented [65, 69], the each region based on base content as described by Cal- selective pressures contributing to these distributions derwood et al. [41]. Te percentage of each base on the have not been identifed, and may or may not refect the positive strand was calculated per region, averaged, and selective pressures in place when the var1csa/var2csa/ graphed using Prism (www.​graph​pad.​com). UpsA/B/C/D/E organization evolved. Similarly, the chro- nicity of infections with species other than P. falciparum Presence of regulatory proteins, var1csa, and var2csa have not been rigorously investigated. Tus, this hypoth- in the Laverania Clade esis remains speculative. Te histone methyltransferases SET2 [21, 22] and SET10 In conclusion, the strategy of avoiding splenic clear- [23], the demethylase JmJC1 [22, 24], the histone dea- ance through cytoadherence, a phenomenon that lies at cetylases SIR2A and SIR2B [25–27], the translation factor the heart of the exceptional virulence of P. falciparum in PTEF [28] and the RECQ helicases RQ1 and WRN [29, humans, appears to have evolved in the common ances- 30] were chosen based on their proposed roles in the reg- tor of the entire Laverania subgenus. Along with the ulation of var gene expression. Te amino acid sequences var gene family and the cytoadhesive protein EMP1, the and the genomic position of each gene from the 3D7 ref- basic elements for transcriptional regulation of this fam- erence genome of P. falciparum were obtained from the ily, including the proteins implicated in the epigenetic EupathDb database (www.​plasm​odb.​org) and analysed regulation of transcriptional activation, silencing and using the Basic Local Alignment Search Tool (BLAST; mutually exclusive expression, were present in the com- ncbi.nlm.gov). Te top hit corresponded to the ortholo- mon ancestor of all Laverania. Tis also includes the reg- gous gene in each species, and the syntenic chromosomal ulatory element found in var introns and its associated location for each gene was verifed. With the exception noncoding RNAs. Tus, strong cytoadhesion and tightly of PTEF in P. blacklocki (incomplete sequence assembly), regulated antigenic variation based on mutually exclusive every gene resided in the syntenic location based on the var gene expression are universal characteristics of these 3D7 genome. For var1csa and var2csa, the same proce- parasites and represent an ancient, conserved evolution- dure was performed. ary adaptation of the Laverania. However, not all aspects of var gene regulation and cytoadhesion through EMP1 Upstream var regulatory region analysis expression have remained unchanged and at least two Collecting sequences: Analysis of var upstream sequences signifcant events have helped shape the evolution of the was performed as described by Kraemer and Smith [57]. most virulent of the malaria species that infect humans Te var genes from each species were compiled by down- (Fig. 5A). In addition to changes in var transcriptional loading 1.5 kb upstream of each “PfEMP1” or “EMP1” regulation, Otto and colleagues previously documented labeled gene on PlasmoDB. Te number of compiled a signifcant alteration in EMP1 structure and speculated sequences were compared with the number of genes that this could indicate a change in the host cell sur- reported by Otto et al. [12], and in all cases, the initial face receptors used for cytoadherence [12]. How these number of genes we collected exceeded the number pre- changes in cytoadhesive properties afected interactions viously reported. Each gene was reevaluated based on the with their hosts is not known, however they have been following criteria: Genes with upstream sequences below incorporated into a highly successful human pathogen 500 bp or that had a truncated exon 1 were removed. that continues to cause substantial morbidity and mor- Similarly, genes with greater than 2 introns or that con- tality throughout the developing world. Understanding tained ambiguous nucleotides labeled as “N” in the the key evolutionary events that shaped this subgenus of upstream sequences were discarded unless the ambigu- parasites provides valuable insights into the pathogenesis ous nucleotides could be deleted from the 5’ end of the and persistence of malaria caused by P. falciparum. sequence without going below 500 bp. Genes on unas- Methods sembled chromosomes were noted. A full description of the sequences from each species that were included var Intron structure analysis in the analyses is provided in Additional fle 2, and a description of the number of genes found for each spe- Analysis of var intron base pair content was performed cies analyzed in this study compared to that described by as described by Calderwood et al. [41]. Briefy, 10 var Otto et al. [12] is provided in Additional fle 1, Table 1. genes were randomly selected for each species in the Laverania clade from the genome sequence database at EupathDB.org. Each gene was uploaded as an individual fle into the sequence analysis program SnapGene (www.​ snapg​ene.​com) and the intron was manually divided into Gross et al. BMC Ecol Evo (2021) 21:139 Page 13 of 15

Determining Chromosomal Location and Orienta- through the databases originally described by Otto et al., Nature Microbiol- ogy, 2018. These sequences were submitted to EBI, project ID PRJEB13584 tion: Chromosomal location and orientation for each var (secondary study accession: ERP015144). These can also be accessed via ftp://​ gene was determined as described by Lavstsen et al. [45] ftp.​sanger.​ac.​uk/​pub/​proje​ct/​patho​gens/​Plasm​odium/​Laver​ania/. and Kraemer et al. [57]. Subtelomeric regions included sequences within 100 kb of either chromosome end Declarations while sequences between the two subtelomeric regions Ethics approval and consent to participate were annotated as being within a chromosomal internal Not applicable. region. Te orientation of subtelomeric genes was deter- mined based on direction of transcription, either toward Consent for publication Not applicable. or away from the adjacent telomere. For genes located on unassembled chromosomes, chromosomal location Competing interests could sometimes be inferred by locating the nearest con- The authors declare that they have no competing interests. served, single copy gene, then determining the chromo- Received: 28 April 2021 Accepted: 2 July 2021 some position of its orthologous gene within the 3D7 reference genome sequence. Constructing maximum-likelihood phylogenetic trees: Te collected sequences for the var upstream regions References from each species were aligned via Clustal Omega 1. Galen SC, Borner J, Martinsen ES, Schaer J, Austin CC, West CJ, Perkins SL. The polyphyly of Plasmodium: comprehensive phylogenetic analyses of (https://​www.​ebi.​ac.​uk/​Tools/​msa/​clust​alo/) [73] and the the malaria parasites (order ) reveal widespread taxonomic resulting FASTA sequences were analyzed using MEGA- confict. R Soc Open Sci. 2018;5: 171780. X [64], a molecular evolutionary genetics analysis tool 2. WHO. 2018. World malaria report 2018. Organization WH, 3. Miller LH, Baruch DI, Marsh K, Doumbo OK. The pathogenic basis of (https://​www.​megas​oftwa​re.​net/). Te ML tree was gen- malaria. Nature. 2002;415:673–9. erated with the Jukes-Cantor Model (63) and 1000 boot- 4. Kwiatkowski DP. How malaria has afected the human genome and strap replicates. Te ML Heuristic Method used was what human genetics can teach us about malaria. Am J Hum Genet. 2005;77:171–92. Nearest-Neighbor-Interchange (NNI) and BioNJ. 5. Templeton TJ. The varieties of gene amplifcation, diversifcation and hypervariability in the human malaria parasite, . Supplementary Information Mol Biochem Parasitol. 2009;166:109–16. The online version contains supplementary material available at https://​doi.​ 6. Wahlgren M, Goel S, Akhouri RR. Variant surface antigens of Plasmo- org/​10.​1186/​s12862-​021-​01872-z. dium falciparum and their roles in severe malaria. Nat Rev Microbiol. 2017;15:479–91. 7. Pasternak ND, Dzikowski R. PfEMP1: an antigen that plays a key role in the Additional fle 1: Maximum-likelihood phylogenetic trees of the 0.5– pathogenicity and immune evasion of the malaria parasite Plasmodium 1.5 kb upstream regulatory regions of var genes from all seven Laverania falciparum. Int J Biochem Cell Biol. 2009;41:1463–6. species and Collected Ups sequences from all species compared with the 8. Deitsch KW, Dzikowski R. Variant gene expression and antigenic variation number of genes reported by Otto et al., 2018. by malaria parasites. Annu Rev Microbiol. 2017;71:625–41. 9. Chen DS, Barry AE, Leliwa-Sytek A, Smith TA, Peterson I, Brown SM, Additional fle 2: Description of individual sequences analyzed. This fle Migot-Nabias F, Deloron P, Kortok MM, Marsh K, Daily JP, Ndiaye D, Sarr O, provides a description of each sequence analyzed, organized according to Mboup S, Day KP. A molecular epidemiological study of var gene diversity species. Specifc characteristics for each sequence are provided. to characterize the reservoir of Plasmodium falciparum in humans in Africa. PLoS ONE. 2011;6: e16629. 10. Childs LM, Buckee CO. Dissecting the determinants of malaria chronicity: Acknowledgements why within-host models struggle to reproduce infection dynamics. J R The authors would like to acknowledge the support and encouragement of Soc Interface. 2015;12:20141379. the Student Research Program of the Bronx High School of Science to RH. 11. Sama W, Killeen G, Smith T. Estimating the duration of Plasmodium falci- parum infection from trials of indoor residual spraying. Am J Trop Med Authors’ contributions Hyg. 2004;70:625–34. MRG and KWD designed the study. MRG and RH obtained the sequences 12. Otto TD, Gilabert A, Crellen T, Bohme U, Arnathau C, Sanders M, Oyola SO, and performed the analyses. MRG, RH and KWD wrote and approved the fnal Okouga AP, Boundenga L, Willaume E, Ngoubangoye B, Moukodoum ND, manuscript. Paupy C, Durand P, Rougeron V, Ollomo B, Renaud F, Newbold C, Berriman M, Prugnolle F. Genomes of all known members of a Plasmodium subge- Funding nus reveal paths to virulent human malaria. Nat Microbiol. 2018;3:687–97. This project was funded by National Institutes of Health grants AI52390 and 13. Chookajorn T, Dzikowski R, Frank M, Li F, Jiwani AZ, Hartl DL, Deitsch KW. AI99327 to KWD. The Department of Microbiology and Immunology at Weill Epigenetic memory at malaria virulence genes. Proc Natl Acad Sci U S A. Medical College of Cornell University acknowledges the support of the Wil- 2007;104:899–902. liam Randolph Hearst Foundation. 14. Flueck C, Bartfai R, Volz J, Niederwieser I, Salcedo-Amaya AM, Alako BT, Ehlgen F, Ralph SA, Cowman AF, Bozdech Z, Stunnenberg HG, Voss TS. Availability of data and materials Plasmodium falciparum heterochromatin protein 1 marks genomic loci The datasets analysed during the current study are available in the Eupathdb linked to phenotypic variation of exported virulence factors. PLoS Pathog. database (Plasmodb.org). In addition to the datasets available through the 2009;5: e1000569. Eupathdb database, additional genome sequence data were obtained Gross et al. BMC Ecol Evo (2021) 21:139 Page 14 of 15

15. Perez-Toledo K, Rojas-Meza AP, Mancio-Silva L, Hernandez-Cuevas NA, 32. Bancells C, Deitsch KW. A molecular switch in the efciency of transla- Delgadillo DM, Vargas M, Martinez-Calvillo S, Scherf A, Hernandez-Rivas tion reinitiation controls expression of var2csa, a gene implicated in R. Plasmodium falciparum heterochromatin protein 1 binds to tri-methyl- pregnancy-associated malaria. Mol Microbiol. 2013. https://​doi.​org/​10.​ ated histone 3 lysine 9 and is linked to mutually exclusive expression of 1111/​mmi.​12379. var genes. Nucleic Acids Res. 2009;37:2596–606. 33. Salanti A, Dahlback M, Turner L, Nielsen MA, Barfod L, Magistrado P, 16. Otto TD, Bohme U, Sanders M, Reid A, Bruske EI, Dufy CW, Bull PC, Pear- Jensen AT, Lavstsen T, Ofori MF, Marsh K, Hviid L, Theander TG. Evidence son RD, Abdi A, Dimonte S, Stewart LB, Campino S, Kekre M, Hamilton for the involvement of VAR2CSA in pregnancy-associated malaria. J Exp WL, Claessens A, Volkman SK, Ndiaye D, Amambua-Ngwa A, Diakite M, Med. 2004;200:1197–203. Fairhurst RM, Conway DJ, Franck M, Newbold CI, Berriman M. Long read 34. Kaur C, Kumar M, Patankar S. Messenger RNAs with large numbers of assemblies of geographically dispersed Plasmodium falciparum isolates upstream open reading frames are translated via leaky scanning and reveal highly structured subtelomeres. Wellcome Open Res. 2018;3:52. reinitiation in the asexual stages of Plasmodium falciparum. Parasitology. 17. Pilosof S, He Q, Tiedje KE, Ruybal-Pesantez S, Day KP, Pascual M. Competi- 2020;147:1100–13. tion for hosts modulates vast antigenic diversity to generate persistent 35. Su X, Heatwole VM, Wertheimer SP, Guinet F, Herrfeldt JV, Peterson strain structure in Plasmodium falciparum. PLoS Biol. 2019;17: e3000336. DS, Ravetch JV, Wellems TE. A large and diverse gene family (var) 18. Salanti A, Staalsoe T, Lavstsen T, Jensen ATR, Sowa MPK, Arnot DE, Hviid encodes 200–350 kD proteins implicated in the antigenic variation and L, Theander TG. Selective upregulation of a single distinctly structured cytoadherence of Plasmodium falciparum-infected erythrocytes. Cell. var gene in chondroitin sulphate A-adhering Plasmodium falciparum 1995;82:89–100. involved in pregnancy-associated malaria. Mol Microbiol. 2003;49:179–91. 36. Epp C, Li F, Howitt CA, Chookajorn T, Deitsch KW. Chromatin associated 19. Ukaegbu UE, Zhang X, Heinberg AR, Wele M, Chen Q, Deitsch KW. A sense and antisense noncoding RNAs are transcribed from the var gene unique virulence gene occupies a principal position in immune evasion family of virulence genes of the malaria parasite Plasmodium falciparum. by the malaria parasite Plasmodium falciparum. PLoS Genet. 2015;11: RNA. 2009;15:116–27. e1005234. 37. Amit-Avraham I, Pozner G, Eshar S, Fastman Y, Kolevzon N, Yavin E, 20. Mok BW, Ribacke U, Rasti N, Kironde F, Chen Q, Nilsson P, Wahlgren M. Dzikowski R. Antisense long noncoding RNAs regulate var gene activa- Default pathway of var2csa switching and translational repression in tion in the malaria parasite Plasmodium falciparum. Proc Natl Acad Sci U S Plasmodium falciparum. PLoS ONE. 2008;3: e1982. A. 2015;112:E982–91. 21. Jiang L, Mu J, Zhang Q, Ni T, Srinivasan P, Rayavara K, Yang W, Turner L, 38. Deitsch KW, Calderwood MS, Wellems TE. Malaria. Cooperative silencing Lavstsen T, Theander TG, Peng W, Wei G, Jing Q, Wakabayashi Y, Bansal A, elements in var genes. Nature. 2001;412:875–6. Luo Y, Ribeiro JM, Scherf A, Aravind L, Zhu J, Zhao K, Miller LH. PfSETvs 39. Dzikowski R, Li F, Amulic B, Eisberg A, Frank M, Patel S, Wellems TE, Deitsch methylation of histone H3K36 represses virulence genes in Plasmodium KW. Mechanisms underlying mutually exclusive expression of virulence falciparum. Nature. 2013;499:223–7. genes by malaria parasites. EMBO Rep. 2007;8:959–65. 22. Ukaegbu UE, Kishore SP, Kwiatkowski DL, Pandarinath C, Dahan-Pasternak 40. Bryant JM, Regnault C, Scheidig-Benatar C, Baumgarten S, Guizetti J, Scherf N, Dzikowski R, Deitsch KW. Recruitment of PfSET2 by RNA polymerase II A. CRISPR/Cas9 genome editing reveals that the intron is not essential for to variant antigen encoding loci contributes to antigenic variation in P. var2csa gene activation or silencing in Plasmodium falciparum. MBio. 2017. falciparum. PLoS Pathog. 2014;10: e1003854. https://​doi.​org/​10.​1128/​mBio.​00729-​17. 23. Volz JC, Bartfai R, Petter M, Langer C, Josling GA, Tsuboi T, Schwach F, 41. Calderwood MS, Gannoun-Zaki L, Wellems TE, Deitsch KW. Plasmodium Baum J, Rayner JC, Stunnenberg HG, Dufy MF, Cowman AF. PfSET10, falciparum var genes are regulated by two regions with separate promoters, a Plasmodium falciparum methyltransferase, maintains the active var one upstream of the coding region and a second within the intron. J Biol gene in a poised state during parasite division. Cell Host Microbe. Chem. 2003;278:34125–32. 2012;11:7–18. 42. Rask TS, Hansen DA, Theander TG, Gorm PA, Lavstsen T. Plasmodium falcipa- 24. Kishore SP, Stiller JW, Deitsch KW. Horizontal gene transfer of epigenetic rum erythrocyte membrane protein 1 diversity in seven genomes—divide machinery and evolution of parasitism in the malaria parasite Plasmo- and conquer. PLoS Comput Biol. 2010. https://​doi.​org/​10.​1371/​journ​al.​pcbi.​ dium falciparum and other apicomplexans. BMC Evol Biol. 2013;13:37. 10009​33. 25. Duraisingh MT, Voss TS, Marty AJ, Dufy MF, Good RT, Thompson JK, 43. Wang CW, Lavstsen T, Bengtsson DC, Magistrado PA, Berger SS, Marquard Freitas-Junior LH, Scherf A, Crabb BS, Cowman AF. Heterochromatin AM, Alifrangis M, Lusingu JP, Theander TG, Turner L. Evidence for in vitro and silencing and locus repositioning linked to regulation of virulence genes in vivo expression of the conserved VAR3 (type 3) plasmodium falciparum in Plasmodium faiciparum. Cell. 2005;121:13–24. erythrocyte membrane protein 1. Malar J. 2012;11:129. 26. Freitas-Junior LH, Hernandez-Rivas R, Ralph SA, Montiel-Condado D, 44. Otto T, Assefa S, Böhme U, Sanders M, Kwiatkowski D, Berriman M, Newbold Ruvalcaba-Salazar OK, Rojas-Meza AP, Mancio-Silva L, Leal-Silvestre RJ, D. Evolutionary analysis of the most polymorphic gene family in falciparum Gontijo AM, Shorte S, Scherf A. Telomeric heterochromatin propaga- malaria. Wellcome Open Res. 2019;4:193. tion and histone acetylation control mutually exclusive expression of 45. Lavstsen T, Salanti A, Jensen ATR, Arnot DE, Theander TG. Sub-grouping of antigenic variation genes in malaria parasites. Cell. 2005;121:25–36. Plasmodium falciparum 3D7 var genes based on sequence analysis of cod- 27. Tonkin CJ, Carret CK, Duraisingh MT, Voss TS, Ralph SA, Hommel M, Dufy ing and non-coding regions. Malar J. 2003;2:27. MF, Silva LM, Scherf A, Ivens A, Speed TP, Beeson JG, Cowman AF. Sir2 46. Trimnell AR, Kraemer SM, Mukherjee S, Phippard DJ, Janes JH, Flamoe E, Su paralogues cooperate to regulate virulence genes and antigenic variation XZ, Awadalla P, Smith JD. Global genetic diversity and evolution of var genes in Plasmodium falciparum. PLoS Biol. 2009;7:e84. associated with placental and severe childhood malaria. Mol Biochem 28. Chan S, Frasch A, Mandava CS, Ch’ng JH, Quintana MDP, Vesterlund M, Parasitol. 2006;148:169–80. Ghorbal M, Joannin N, Franzen O, Lopez-Rubio JJ, Barbieri S, Lanzavecchia 47. Barry AE, Leliwa-Sytek A, Tavul L, Imrie H, Migot-Nabias F, Brown SM, McVean A, Sanyal S, Wahlgren M. Regulation of PfEMP1-VAR2CSA translation by a GA, Day KP. Population genomics of the immune evasion (var) genes of Plas- Plasmodium translation-enhancing factor. Nat Microbiol. 2017;2:17068. modium falciparum. PLoS Pathog. 2007;3:e34. 29. Li Z, Yin S, Sun M, Cheng X, Wei J, Gilbert N, Miao J, Cui L, Huang Z, Dai X, 48. Kraemer SM, Kyes SA, Aggarwal G, Springer AL, Nelson SO, Christodoulou Jiang L. DNA helicase RecQ1 regulates mutually exclusive expression of Z, Smith LM, Wang W, Levin E, Newbold CI, Myler PJ, Smith JD. Patterns of virulence genes in Plasmodium falciparum via heterochromatin alteration. gene recombination shape var gene repertoires in Plasmodium falciparum: Proc Natl Acad Sci U S A. 2019;116:3177–82. comparisons of geographically diverse isolates. BMC Genomics. 2007;8:45. 30. Claessens A, Harris LM, Stanojcic S, Chappell L, Stanton A, Kuk N, 49. Zilversmit MM, Chase EK, Chen DS, Awadalla P, Day KP, McVean G. Veneziano-Broccia P, Sterkers Y, Rayner JC, Merrick CJ. RecQ helicases in Hypervariable antigen genes in malaria have ancient roots. BMC Evol Biol. the malaria parasite Plasmodium falciparum afect genome stability, gene 2013;13:110. expression patterns and DNA replication dynamics. PLoS Genet. 2018;14: 50. Bopp SE, Manary MJ, Bright AT, Johnston GL, Dharia NV, Luna FL, e1007490. McCormack S, Ploufe D, McNamara CW, Walker JR, Fidock DA, Denchi EL, 31. Amulic B, Salanti A, Lavstsen T, Nielsen MA, Deitsch KW. An upstream Winzeler EA. Mitotic evolution of Plasmodium falciparum shows a stable open reading frame controls translation of var2csa, a gene implicated in core genome but recombination in antigen families. PLoS Genet. 2013;9: placental malaria. PLoS Pathog. 2009;5: e1000256. e1003293. Gross et al. BMC Ecol Evo (2021) 21:139 Page 15 of 15

51. Claessens A, Hamilton WL, Kekre M, Otto TD, Faizullabhoy A, Rayner JC, 63. Jukes TH, Cantor CR. Evolution of protein molecules. In: Munro H, editor. Kwiatkowski D. Generation of antigenic diversity in Plasmodium falciparum Mammalian protein metabolism. New York: Academic Press; 1969. p. by structured rearrangement of Var genes during mitosis. PLoS Genet. 21–132. 2014;10: e1004812. 64. Kumar S, Stecher G, Li M, Knyaz C, Tamura K. MEGA X: molecular evo- 52. Zhang X, Alexander N, Leonardi I, Mason C, Kirkman LA, Deitsch KW. Rapid lutionary genetics analysis across computing platforms. Mol Biol Evol. antigen diversifcation through mitotic recombination in the human malaria 2018;35:1547–9. parasite Plasmodium falciparum. PLoS Biol. 2019;17: e3000271. 65. Liu W, Sherrill-Mix S, Learn GH, Scully EJ, Li Y, Avitto AN, Loy DE, Lauder 53. Kyes SA, Christodoulou Z, Raza A, Horrocks P, Pinches R, Rowe JA, Newbold AP, Sundararaman SA, Plenderleith LJ, Ndjango JN, Georgiev AV, Ahuka- CI. A well-conserved Plasmodium falciparum var gene shows an unusual Mundeke S, Peeters M, Bertolani P, Dupain J, Garai C, Hart JA, Hart TB, Shaw stage-specifc transcript pattern. Mol Microbiol. 2003;48:1339–48. GM, Sharp PM, Hahn BH. Wild bonobos host geographically restricted 54. Winter G, Chen QJ, Flick K, Kremsner P, Fernandez V, Wahlgren M. The malaria parasites including a putative new Laverania species. Nat Commun. 3D7var5.2 (var(COMMON)) type var gene family is commonly expressed 2017;8:1635. in non-placental Plasmodium falciparum malaria. Mol Biochem Parasitol. 66. Recker M, Buckee CO, Serazin A, Kyes S, Pinches R, Christodoulou Z, Springer 2003;127:179–91. AL, Gupta S, Newbold CI. Antigenic variation in Plasmodium falciparum 55. Sander AF, Salanti A, Lavstsen T, Nielsen MA, Magistrado P, Lusingu J, Ndam malaria involves a highly structured switching pattern. PLoS Pathog. 2011;7: NT, Arnot DE. Multiple var2csa-type PfEMP1 genes located at diferent e1001306. chromosomal loci occur in many Plasmodium falciparum isolates. PLoS ONE. 67. Jensen AT, Magistrado P, Sharp S, Joergensen L, Lavstsen T, Chiucchiuini A, 2009;4: e6667. Salanti A, Vestergaard LS, Lusingu JP, Hermsen R, Sauerwein R, Christensen 56. Gardner MJ, Hall N, Fung E, White O, Berriman M, Hyman RW, Carlton JM, J, Nielsen MA, Hviid L, Sutherland C, Staalsoe T, Theander TG. Plasmodium Pain A, Nelson KE, Bowman S, Paulsen IT, James K, Eisen JA, Rutherford K, falciparum associated with severe childhood malaria preferentially expresses Salzberg SL, Craig A, Kyes S, Chan MS, Nene V, Shallom SJ, Suh B, Peterson PfEMP1 encoded by group A var genes. J Exp Med. 2004;199:1179–90. J, Angiuoli S, Pertea M, Allen J, Selengut J, Haft D, Mather MW, Vaidya AB, 68. Volkman SK, Barry AE, Lyons EJ, Nielsen KM, Thomas SM, Choi M, Thakore SS, Martin DM, Fairlamb AH, Fraunholz MJ, Roos DS, Ralph SA, McFadden GI, Day KP, Wirth DF, Hartl DL. Recent origin of Plasmodium falciparum from a Cummings LM, Subramanian GM, Mungall C, Venter JC, Carucci DJ, Hofman single progenitor. Science. 2001;293:482–4. SL, Newbold C, Davis RW, Fraser CM, Barrell B. Genome sequence of the 69. Sharp PM, Plenderleith LJ, Hahn BH. Ape origins of human malaria. Annu human malaria parasite Plasmodium falciparum. Nature. 2002;419:498–511. Rev Microbiol. 2020;74:39–63. 57. Kraemer SM, Smith JD. Evidence for the importance of genetic structuring 70. Rayner JC, Liu W, Peeters M, Sharp PM, Hahn BH. A plethora of Plasmo- to the structural and functional specialization of the Plasmodium falciparum dium species in wild apes: a source of human infection? Trends Parasitol. var gene family. Mol Microbiol. 2003;50:1527–38. 2011;27:222–9. 58. Jensen ATR, Magistrado P, Sharp S, Joergensen L, Lavstsen T, Chiucciuini A, 71. Joordens JCA, Feibel CS, Vonhof HB, Schulp AS, Kroon D. Relevance of the Salanti A, Vestergaard LS, Lusingu JP, Hermsen R, Sauerwein R, Christensen J, eastern African coastal forest for early hominin biogeography. J Hum Evol. Nielsen MA, Hviid L, Sutherland C, Staalsoe T, Theander TG. Plasmodium fal- 2019;131:176–202. ciparum associated with severe childhood malaria preferentially expresses 72. Couvreur TLP, Dauby G, Blach-Overgaard A, Deblauwe V, Dessein S, Droissart PfEMP1 encoded by group A var genes. J Exp Med. 2004;199:1179–90. V, Hardy OJ, Harris DJ, Janssens SB, Ley AC, Mackinder BA, Sonke B, Sosef 59. Lavstsen T, Magistrado P, Hermsen CC, Salanti A, Jensen AT, Sauerwein MSM, Stevart T, Svenning JC, Wieringa JJ, Faye A, Missoup AD, Tolley KA, R, Hviid L, Theander TG, Staalsoe T. Expression of Plasmodium falciparum Nicolas V, Ntie S, Fluteau F, Robin C, Guillocheau F, Barboni D, Sepulchre P. erythrocyte membrane protein 1 in experimentally infected humans. Malar Tectonics, climate and the diversifcation of the tropical African terrestrial J. 2005;4:21. fora and fauna. Biol Rev Camb Philos Soc. 2020. https://​doi.​org/​10.​1111/​brv.​ 60. Rottmann M, Lavstsen T, Mugasa JP, Kaestli M, Jensen AT, Muller D, Theander 12644. T, Beck HP. Diferential expression of var gene groups is associated with 73. Madeira F, Park YM, Lee J, Buso N, Gur T, Madhusoodanan N, Basutkar P, morbidity caused by Plasmodium falciparum infection in Tanzanian children. Tivey ARN, Potter SC, Finn RD, Lopez R. The EMBL-EBI search and sequence Infect Immun. 2006;74:3904–11. analysis tools APIs in 2019. Nucleic Acids Res. 2019;47:W636–41. 61. Frank M, Dzikowski R, Amulic B, Deitsch K. Variable switching rates of malaria virulence genes are associated with chromosomal position. Mol Microbiol. 2007;64:1486–98. Publisher’s Note 62. Noble R, Christodoulou Z, Kyes S, Pinches R, Newbold CI, Recker M. The Springer Nature remains neutral with regard to jurisdictional claims in pub- antigenic switching network of Plasmodium falciparum and its implications lished maps and institutional afliations. for the immuno-epidemiology of malaria. Elife (Cambridge). 2013;2: e01074.

Ready to submit your research ? Choose BMC and benefit from:

• fast, convenient online submission • thorough peer review by experienced researchers in your field • rapid publication on acceptance • support for research data, including large and complex data types • gold Open Access which fosters wider collaboration and increased citations • maximum visibility for your research: over 100M website views per year

At BMC, research is always in progress.

Learn more biomedcentral.com/submissions